I am trying to understand how Python works (because I use it all the time!). To my understanding, when you run something like python script.py, the script is converted to bytecode and then the interpreter/VM/CPython–really just a C Program–reads in the python bytecode and executes the program accordingly.
How is this bytecode read in? Is it similar to how a text file is read in C? I am unsure how the Python code is converted to machine code. Is it the case that the Python interpreter (the python command in the CLI) is really just a precompiled C program that is already converted to machine code and then the python bytecode files are just put through that program? In other words, is my Python program never actually converted into machine code? Is the python interpreter already in machine code, so my script never has to be?
Yes, your understanding is correct. There is basically (very basically) a giant switch statement inside the CPython interpreter that says "if the current opcode is so and so, do this and that".
http://hg.python.org/cpython/file/3.3/Python/ceval.c#l790
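To make the "giant switch statement" idea concrete, here is a toy sketch in Python. It is not CPython's actual loop (which is written in C and handles many more opcodes); the `run` function and the tuple-based instruction format are invented purely for illustration.

```python
def run(instructions, names):
    """Execute (opcode, argument) pairs on a tiny stack machine (toy model)."""
    stack = []
    for opcode, arg in instructions:
        # This if/elif chain plays the role of the C switch statement.
        if opcode == "LOAD_NAME":
            stack.append(names[arg])
        elif opcode == "LOAD_CONST":
            stack.append(arg)
        elif opcode == "BINARY_TRUE_DIVIDE":
            right = stack.pop()
            left = stack.pop()
            stack.append(left / right)
        elif opcode == "RETURN_VALUE":
            return stack.pop()
        else:
            raise ValueError("unknown opcode: %r" % (opcode,))


# A hand-written "bytecode" program equivalent to the expression i/3.
program = [
    ("LOAD_NAME", "i"),
    ("LOAD_CONST", 3),
    ("BINARY_TRUE_DIVIDE", None),
    ("RETURN_VALUE", None),
]
result = run(program, {"i": 12})
```

The loop itself is ordinary compiled code; the program being run is only ever data that the loop inspects.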
Other implementations, like PyPy, have JIT compilation, i.e. they translate Python to machine code on the fly.
If you want to see the bytecode of some code (whether source code, a live function object or code object, etc.), the dis module will tell you exactly what you need. For example:
>>> import dis
>>> dis.dis('i/3')
  1           0 LOAD_NAME                0 (i)
              3 LOAD_CONST               0 (3)
              6 BINARY_TRUE_DIVIDE
              7 RETURN_VALUE
The dis docs explain what each bytecode means. For example, LOAD_NAME:
Pushes the value associated with co_names[namei] onto the stack.
To understand this, you have to know that the bytecode interpreter is a virtual stack machine, and what co_names is. The inspect module docs have a nice table showing the most important attributes of the most important internal objects, so you can see that co_names is an attribute of code objects which holds a tuple of the names used by the bytecode (other than function arguments and function locals, which live in co_varnames). In other words, LOAD_NAME 0 pushes the value associated with the 0th name in that tuple (and dis helpfully looks this up and sees that the 0th name is 'i').
And that's enough to see that a string of bytecodes isn't enough; the interpreter also needs the other attributes of the code object, and in some cases attributes of the function object (which is also where the locals and globals environments come from).
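As a small sketch of those extra attributes, the built-in compile function returns a code object whose fields can be inspected directly. The filename "<example>" is just a placeholder label, and the exact bytes in co_code vary between Python versions.

```python
# Compile the same expression that was disassembled above.
code = compile("i/3", "<example>", "eval")

names = code.co_names    # names referenced by the bytecode
consts = code.co_consts  # constants referenced by the bytecode
raw = code.co_code       # the raw bytecode string itself (version-specific)

# The bytecode only says "load name #0"; the environment supplies the value.
value = eval(code, {"i": 12})
```

The raw bytes refer to names and constants by index, so they are meaningless without the tuples stored alongside them.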
The inspect module also has some tools that can help you further in investigating live code.
This is enough to figure out a lot of interesting stuff. For example, you probably know that Python figures out at compile time whether a variable in a function is local, closure, or global, based on whether you assign to it anywhere in the function body (and on any nonlocal or global statements); if you write three different functions and compare their disassembly (and the relevant other attributes) you can pretty easily figure out exactly what it must be doing.
(The one bit that's tricky here is understanding closure cells. To really get this, you will need to have 3 levels of functions, to see how the one in the middle forwards things along for the innermost one.)
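Here is one possible version of that experiment, as a sketch. It looks at code object attributes rather than the disassembly, because opcode names differ between Python versions. All function and variable names are made up for the example.

```python
counter = 0  # a module-level ("global") name


def uses_local():
    total = 1        # assigned in the body, so `total` is a local
    return total


def uses_global():
    return counter   # never assigned here, so it is looked up as a global


def outer():
    x = "cell"       # used by a nested function, so it lives in a closure cell

    def middle():
        # middle never touches x itself, but must forward it to inner
        def inner():
            return x
        return inner

    return middle


middle = outer()
inner = middle()

local_names = uses_local.__code__.co_varnames   # locals of uses_local
global_names = uses_global.__code__.co_names    # names looked up at runtime
cells = outer.__code__.co_cellvars              # variables outer owns as cells
forwarded = middle.__code__.co_freevars         # cells middle passes along
captured = inner.__code__.co_freevars           # cells inner actually reads
```

Running dis.dis on each function shows the matching load instructions, but the attribute tuples above are the version-independent part of the picture.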
To understand how the bytecode is interpreted and how the stack machine works (in CPython), you need to look at the ceval.c source code. The answers by thy435 and eyquem already cover this.
Understanding how pyc files are read only takes a bit more information. Ned Batchelder has a great (if slightly out-of-date) blog post called The structure of .pyc files, that covers all of the tricky and not-well-documented parts. (Note that in 3.3, some of the gory code related to importing has been moved from C to Python, which makes it much easier to follow.) But basically, it's just some header info and the module's code object, serialized by marshal.
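A rough sketch of reading a .pyc by hand follows. It assumes the 16-byte header used by CPython 3.7 and later (older versions use a shorter header, so HEADER_SIZE would need adjusting), and the file name demo.py is invented for the example.

```python
import importlib.util
import marshal
import os
import py_compile
import tempfile

HEADER_SIZE = 16  # assumption: CPython 3.7+ (magic, flags, mtime or hash, size)

with tempfile.TemporaryDirectory() as tmp:
    source_path = os.path.join(tmp, "demo.py")
    with open(source_path, "w") as f:
        f.write("answer = 6 * 7\n")

    # Compile the source to an explicit .pyc path and read the raw bytes back.
    pyc_path = py_compile.compile(
        source_path, cfile=os.path.join(tmp, "demo.pyc"), doraise=True
    )
    with open(pyc_path, "rb") as f:
        data = f.read()

magic = data[:4]                         # identifies the bytecode version
code = marshal.loads(data[HEADER_SIZE:])  # the module's code object

# Executing the unmarshalled code object runs the module body.
namespace = {}
exec(code, namespace)
```

Only load .pyc data you trust this way: marshal is not designed to be safe against malicious input.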
Understanding how source gets compiled to bytecode is the fun part.
Design of CPython's Compiler explains how everything works. (Some of the other sections of the Python Developer's Guide are also useful.)
For the early stuff—tokenizing and parsing—you can just use the ast module to jump right to the point where it's time to do the actual compiling. Then see compile.c for how that AST gets turned into bytecode.
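For instance, a minimal sketch of stopping at the AST and then handing it to the compiler yourself (the "<ast>" filename is just a placeholder):

```python
import ast

# Tokenizing and parsing: source text -> AST.
tree = ast.parse("i/3", mode="eval")
dumped = ast.dump(tree)  # a printable view of the tree, handy for exploring

# The actual compiling: AST -> code object (this is the compile.c part).
code = compile(tree, "<ast>", "eval")
value = eval(code, {"i": 12})
```

Printing `dumped` shows the node structure that the compiler walks when it emits bytecode.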
The macros can be a bit tough to work through, but once you grasp the idea of how the compiler uses a stack to descend into blocks, and how it uses those compiler_addop and friends to emit bytecodes at the current level, it all makes sense.
One thing that surprises most people at first is the way functions work. The function definition's body is compiled into a code object. Then the function definition itself is compiled into code (inside the enclosing function body, module, etc.) that, when executed, builds a function object from that code object. (Once you think about how closures must work, it's obvious why it works that way. Each instance of the closure is a separate function object with the same code object.)
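A short sketch that makes this visible from Python itself (make_adder is an invented example; the co_consts check relies on CPython's behavior of storing nested code objects as constants):

```python
def make_adder(n):
    def add(x):
        return x + n
    return add


add1 = make_adder(1)
add2 = make_adder(2)

# Two distinct function objects, each with its own closure cell for n...
distinct_functions = add1 is not add2
# ...but both built from the single code object compiled for the body of add.
shared_code = add1.__code__ is add2.__code__
# In CPython that code object is stored as a constant of the enclosing code.
stored_as_constant = any(
    const is add1.__code__ for const in make_adder.__code__.co_consts
)
```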
And now you're ready to start patching CPython to add your own statements, right? Well, as Changing CPython's Grammar shows, there's a lot of stuff to get right (and there's even more if you need to create new opcodes). You might find it easier to learn PyPy as well as CPython, and start hacking on PyPy first, and only come back to CPython once you know that what you're doing is sensible and doable.
Having read the answer of thg4535, I am sure you will find interesting the following explanations on ceval.c : Hello, ceval.c!
This article is part of a series written by Yaniv Aknin, of whom I'm sort of a fan: Python's Innards
When we run a Python program: 1. The Python source code is compiled by CPython to bytecode (bytecode is a binary format, stored in .pyc files serialized with marshal, consisting of instructions for a stack machine that the PVM executes). 2. The PVM (Python virtual machine / Python interpreter) is a stack-based machine (a machine that carries out its work using a stack data structure) which loops over the bytecode instruction by instruction and executes it.
What executes the bytecode?
The bytecode tells the Python interpreter which C code to execute.
For various reasons, in one project I generate executable code by means of generating AST from various source files, then compiling that to bytecode (though the question could also work for cases where the bytecode is generated directly I guess).
From some experimentation, it looks like the debugger more or less just uses the lineno information embedded in the AST alongside the filename passed to compile in order to provide a representation for the debugger's purposes, however this assumes the code being executed comes from a single on-disk file.
That is not necessarily the case for my project, the executable code can be pieced together from multiple sources, and some or all of these sources may have been fetched over the network, or been retrieved from non-disk storage (e.g. database).
And so my Y questions, which may be the wrong ones (hence the background):
is it possible to provide a memory buffer of some sort, or is it necessary to generate a singular on-disk representation of the "virtual source"?
how well would the debugger deal with jumping around between the different bits and pieces if the virtual source can't or should not be linearised[0]
and just in case, is the assumption of Python only supporting a single contiguous source file correct or can it actually be fed multiple sources somehow?
[0] for instance a web-style literate program would be debugged in its original form, jumping between the code sections, not in the so-called "tangled" form
Some of this can be handled by the trepan3k debugger. For other things various hooks are in place.
First of all it can debug based on bytecode alone. But of course stepping instructions won't be possible if the line number table doesn't exist. And for that reason if for no other, I would add a "line number" for each logical stopping point, such as at the beginning of statements. The numbers don't have to be line numbers, they could just count from 1 or be indexes into some other table. This is more or less how Go's Pos type position works.
The debugger will let you set a breakpoint on a function, but that function has to exist, and when you start any Python program most of the functions you define don't exist yet. So the typical way to do this is to modify the source to call the debugger at some point. In trepan3k the lingo for this is:
from trepan.api import debug; debug()
Do that in a place where the other functions you want to break on have been defined.
And the functions can be specified as methods on existing variables, e.g. self.my_function()
One of the advanced features of this debugger is that it will decompile the bytecode to produce source code. There is a command called deparse which will show you the context around where you are currently stopped.
Deparsing bytecode though is a bit difficult so depending on which kind of bytecode you get the results may vary.
As for the virtual source problem, well that situation is somewhat tolerated in the debugger, since that kind of thing has to go on when there is no source. And to facilitate this and remote debugging (where the file locations locally and remotely can be different), we allow for filename remapping.
Another library, pyficache, is used for this remapping; I believe it has the ability to remap contiguous lines of one file into lines in another file. And I think you could use this over and over again. However, so far there hasn't been a need for this, and that code is pretty old. So someone would have to beef up trepan3k here.
Lastly, related to trepan3k is trepan-xpy, a CPython bytecode debugger which can step bytecode instructions even when the line number table is empty.
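On the "memory buffer" part of the question: separately from trepan3k, and not something the answer above mentions, the standard library's linecache module can be seeded with in-memory source so that tracebacks (and tools that read source through linecache) find lines for a file that never existed on disk. The sketch below assumes that approach; compile_virtual and the "<virtual:...>" naming scheme are hypothetical.

```python
import linecache
import traceback


def compile_virtual(source, name):
    """Compile in-memory source and register its lines with linecache."""
    filename = "<virtual:%s>" % name  # hypothetical naming scheme
    lines = source.splitlines(True)
    # An mtime of None makes linecache.checkcache() leave this entry alone.
    linecache.cache[filename] = (len(source), None, lines, filename)
    return compile(source, filename, "exec")


code = compile_virtual("x = 1\ny = undefined_name\n", "db/snippet-1")
try:
    exec(code, {})
except NameError:
    report = traceback.format_exc()

second_line = linecache.getline(code.co_filename, 2)
```

Each fragment compiled this way keeps its own filename and line numbers, so code pieced together from several sources does not have to be linearised into one file first.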
Suppose I have the following code in Python:
a = "WelcomeToTheMachine"
if a == "DarkSideOfTheMoon":
    awersdfvsdvdcvd
print("done!")
Why doesn't this error? How does it even compile? In Java or C#, this would get spotted during compilation.
Python isn't a compiled language, that's why your code doesn't throw compilation errors.
Python is a byte code interpreted language. Technically the source code gets "compiled" to byte code, but then the byte code is just in time (JIT) compiled if using PyPy or Pyston; otherwise it's interpreted one instruction at a time.
The workflow is as follows :
Your Python Code -> Compiler -> .pyc file -> Interpreter -> Your Output
(using the standard Python runtime)
What does all this mean? Essentially all the heavy work happens during runtime, unlike with C or C++ where the source code in its entirety is analyzed and translated to binary at compile time.
During "compiling", python pretty much only checks your syntax. Since awersdfvsdvdcvd is a valid identifier, no error is raised until that line actually gets executed. Just because you use a name which wasn't defined doesn't mean that it couldn't have been defined elsewhere... e.g.:
globals()['awersdfvsdvdcvd'] = 1
earlier in the file would be enough to suppress the NameError that would occur if the line with the misspelled name was executed.
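A small sketch of the difference between compile time and run time, using a variant of the code from the question. The `run` helper and the `status` variable are invented so the outcome can be inspected instead of printed.

```python
SOURCE = '''
if a == "DarkSideOfTheMoon":
    awersdfvsdvdcvd
status = "done!"
'''

# Compiling succeeds: the undefined name is syntactically a valid identifier.
code = compile(SOURCE, "<example>", "exec")


def run(a):
    """Execute the compiled code with a given value of a (hypothetical helper)."""
    namespace = {"a": a}
    try:
        exec(code, namespace)
    except NameError as exc:
        return "NameError: %s" % exc
    return namespace["status"]


skipped = run("WelcomeToTheMachine")  # the branch never runs, so no error
failed = run("DarkSideOfTheMoon")     # the name is looked up and is missing
```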
Ok, so can't python just look for globals statements as well? The answer to that is again "no" -- From module "foo", I can add to the globals of module "bar" in similar ways. And python has no way of knowing what modules are or will be imported until it's actually running (I can dynamically import modules at runtime too).
Note that most of the reasons that I'm mentioning for why Python as a language can't give you a warning about these things involve people doing crazy messed up things. There are a number of tools which will warn you about these things (making the assumption that you aren't going to do stupid stuff like that). My favorite is pylint, but just about any python linter should be able to warn you about undefined variables. If you hook a linter up to your editor, most of the time you can catch these bugs before you ever actually run the code.
Because Python is an interpreted language. This means that if Python's interpreter doesn't reach that line, it won't produce any error.
There's nothing to spot: It's not an "error" as far as Python-the-language is concerned. You have a perfectly valid Python program. Python is a dynamic language, and the identifiers you're using get resolved at runtime.
An equivalent program written in C#, Java or C++ would be invalid, and thus the compilation would fail, since in all those languages the use of an undefined identifier is required to produce a diagnostic to the user (i.e. a compile-time error). In Python, it's simply not known whether that identifier is known or not at compile time. So the code is valid. Think of it this way: in Python, having the address of a construction site (a name) doesn't require the construction to have even started yet. What's important is that by the time you use the address (name) as if there was a building there, there better be a building or else an exception is raised :)
In Python, the following happens:
a = "WelcomeToTheMachine" looks up the enclosing context (here: the module context) for the attribute a, and sets the attribute 'a' to the given string object stored in a pool of constants. It also caches the attribute reference so the subsequent accesses to a will be quicker.
if a == "DarkSideOfTheMoon": finds the a in the cache, and executes a binary comparison operator on object a. This ends up in builtins.str.__eq__. The value returned from this operator is used to control the program flow.
awersdfvsdvdcvd is an expression, whose value is the result of a lookup of the name 'awersdfvsdvdcvd'. This expression is evaluated. In your case, the name is not found in the enclosing contexts, and the lookup raises the NameError exception.
This exception propagates to the matching exception handler. Since the handler is outside of all the nested code blocks in the current module, the print function never gets a chance of being called. Python's built-in exception handler signals the error to the user. The interpreter (a misnomer!) instance has nothing more to do. Since the Python process doesn't try to do anything else after the interpreter instance is done, it terminates.
There's absolutely nothing that says that the program will cause a runtime error. For example, awersdfvsdvdcvd could be set in an enclosing scope before the module is executed, and then no runtime error would be raised. Python allows fine control over the lifetime of a module, and your code could inject the value for awersdfvsdvdcvd after the module has been compiled, but before it got executed. It takes just a few lines of fairly straightforward code to do that.
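One way those "few lines" might look, as a sketch: compile the module's source, create an empty module object, inject the name, and only then execute the code in the module's namespace. The module name late_bound and its one-line source are invented for the example.

```python
import types

# Source that uses a name it never defines.
source = "result = awersdfvsdvdcvd + 1\n"
code = compile(source, "late_bound.py", "exec")  # compiles without complaint

# Create the module object first, and inject the missing name into it...
module = types.ModuleType("late_bound")
module.awersdfvsdvdcvd = 41

# ...then run the already-compiled code inside the module's namespace.
exec(code, module.__dict__)
```

By the time the lookup happens the name exists, so no NameError is raised, which is exactly why the compiler cannot reject the source up front.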
This is, in fact, one of the many dynamic programming techniques that get used in Python programs. Their judicious use makes possible the kinds of functionality that C++ will not natively get in decades or ever, and that are very cumbersome in both C# and Java. Of course, Python has a performance cost - nothing is free.
If you like to get such problems highlighted at compilation time, there are tools you can easily integrate in an IDE that would spot this problem. E.g. PyCharm has a built-in static checker, and this error would be highlighted with the red squiggly line as expected.
Am I getting this straight? Does the PyPy interpreter actually interpret itself and then translate itself?
So here's my current understanding:
RPython's toolchain involves partially executing the program to be translated to get a sort of preprocessed version to annotate and translate.
The PyPy interpreter, running on top of CPython, executes to partially interpret itself, at which point it hands control off to its RPython half, which performs the translation?
If this is true, then this is one of the most mind-bending things I have ever seen.
PyPy's translation process is actually much less conceptually recursive than it sounds.
Really all it is is a Python program that processes Python function/class/other objects (not Python source code) and outputs C code. But of course it doesn't process just any Python objects; it can only handle particular forms, which are what you get if you write your to-be-translated code in RPython.
Since the translation toolchain is a Python program, you can run it on top of any Python interpreter, which obviously includes PyPy's python interpreter. So that's nothing special.
Since it translates RPython objects, you can use it to translate PyPy's python interpreter, which is written in RPython.
But you can't run it on the translation framework itself, which is not RPython. Only PyPy's python interpreter itself is RPython.
Things only get interesting because RPython code is also Python code (but not the reverse), and because RPython doesn't ever "really exist" in source files, but only in memory inside a working Python process that necessarily includes other non-RPython code (there are no "pure-RPython" imports or function definitions, for example, because the translator operates on functions that have already been defined and imported).
Remember that the translation toolchain operates on in-memory Python code objects. Python's execution model means that these don't exist before some Python code has been running. You can imagine that starting the translation process looks a bit like this, if you highly simplify it:
from my_interpreter import main
from pypy import translate
translate(main)
As we all know, just importing main is going to run lots of Python code, including all the other modules my_interpreter imports. But the translation process starts analysing the function object main; it never sees, and doesn't care about, whatever code was executed to come up with main.
One way to think of this is that "programming in RPython" means "writing a Python program which generates an RPython program and then feeds it to the translation process". That's relatively easy to understand and is kind of similar to how many other compilers work (e.g. one way to think of programming in C is that you are essentially writing a C pre-processor program that generates a C program, which is then fed to the C compiler).
Things only get confusing in the PyPy case because all 3 components (the Python program which generates the RPython program, the RPython program, and the translation process) are loaded into the same Python interpreter. This means it's quite possible to have functions that are RPython when called with some arguments and not when called with other arguments, to call helper functions from the translation framework as part of generating your RPython program, and lots of other weird things. So the situation gets rather blurry around the edges, and you can't necessarily divide your source lines cleanly into "RPython to be translated", "Python generating my RPython program" and "handing the RPython program over to the translation framework".
The PyPy interpreter, running on top of CPython, executes to partially interpret itself
What I think you're alluding to here is PyPy's use of the flow object space during translation, to do abstract interpretation. Even this isn't as crazy and mind-bending as it seems at first. I'm much less informed about this part of PyPy, but as I understand it:
PyPy implements all of the operations of a Python interpreter by delegating them to an "object space", which contains an implementation of all the basic built in operations. But you can plug in different object spaces to get different effects, and so long as they implement the same "object space" interface the interpreter will still be able to "execute" Python code.
The RPython code objects that the PyPy translation toolchain processes are Python code that could be executed by an interpreter. So PyPy re-uses part of their Python interpreter as part of the translation tool-chain, by plugging in the flow object space. When "executing" code with this object space, the interpreter doesn't actually carry out the operations of the code, it instead produces flow graphs, which are analogous to the sorts of intermediate representation used by many other compilers; it's just a simple machine-manipulable representation of the code, to be further processed. This is how regular (R)Python code objects get turned into the input for the rest of the translation process.
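To illustrate the idea only, here is a toy model of a pluggable object space. None of this is PyPy's real API: StdSpace, FlowSpace, interpret and the instruction format are all invented, and the recorded operations are a flat list rather than a real flow graph.

```python
class StdSpace:
    """Toy object space that really performs the operations."""

    def wrap(self, value):
        return value

    def add(self, a, b):
        return a + b

    def mul(self, a, b):
        return a * b


class FlowSpace:
    """Toy object space that records operations instead of performing them."""

    def __init__(self):
        self.ops = []

    def wrap(self, value):
        return repr(value)

    def _record(self, name, a, b):
        result = "v%d" % len(self.ops)  # fresh symbolic variable
        self.ops.append((result, name, a, b))
        return result

    def add(self, a, b):
        return self._record("add", a, b)

    def mul(self, a, b):
        return self._record("mul", a, b)


def interpret(program, space, args):
    """One interpreter loop; every operation is delegated to the space."""
    stack = []
    for opcode, arg in program:
        if opcode == "LOAD_ARG":
            stack.append(args[arg])
        elif opcode == "LOAD_CONST":
            stack.append(space.wrap(arg))
        elif opcode in ("ADD", "MUL"):
            right = stack.pop()
            left = stack.pop()
            operation = space.add if opcode == "ADD" else space.mul
            stack.append(operation(left, right))
    return stack.pop()


# Computes arg0 * 2 + 1.
program = [("LOAD_ARG", 0), ("LOAD_CONST", 2), ("MUL", None),
           ("LOAD_CONST", 1), ("ADD", None)]

concrete = interpret(program, StdSpace(), [5])
flow = FlowSpace()
symbolic = interpret(program, flow, ["x"])
```

The same interpret function either computes a value or produces a description of the computation, depending only on which space is plugged in.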
Since the usual thing that is translated with the translation process is PyPy's Python interpreter, it indeed "interprets itself" with the flow object space. But all that really means is that you have a Python program that is processing Python functions, including the ones doing the processing. In itself it isn't any more mind-bending than applying a decorator to itself, or having a wrapper-class wrap an instance of itself (or wrap the class itself).
Um, that got a bit rambly. I hope it helps, anyway, and I hope I haven't said anything inaccurate; please correct me if I have.
Disclaimer: I'm not an expert on PyPy - in particular, I don't understand the details of the RPython translation, I'm only citing stuff that I've read before. For a more specific post on how RPython translation may work, check out this answer.
The answer is, yes, it can (but only after it was first compiled using CPython).
Longer description:
At first it seems highly mind-bending and paradoxical, but once you understand it, it's easy. Check out the answer on Wikipedia.
Bootstrapping in program development began during the 1950s when each program was constructed on paper in decimal code or in binary code, bit by bit (1s and 0s), because there was no high-level computer language, no compiler, no assembler, and no linker. A tiny assembler program was hand-coded for a new computer (for example the IBM 650) which converted a few instructions into binary or decimal code: A1. This simple assembler program was then rewritten in its just-defined assembly language but with extensions that would enable the use of some additional mnemonics for more complex operation codes.
The process is called software bootstrapping. Basically, you build one tool, say a C++ compiler, in a lower-level language which already exists (everything at one point had to be coded from binary), say ASM. Now that you have C++ in existence, you can code a C++ compiler in C++, then use the ASM C++ compiler to compile your new one. Once you have your new compiler compiled, you can use it to compile itself.
So basically, make the first computer tool ever by hand coding it, use that interpreter to make another slightly better one, and use that one to make a better one, ... And eventually you get all the complex software today! :)
Another interesting case, is the CoffeeScript language, which is written in... CoffeeScript. (Although this use case still requires the use of an external interpreter, namely Node.js)
The PyPy interpreter, running on top of CPython, executes to partially interpret itself, at which point it hands control off to its RPython half, which performs the translation?
You can compile PyPy using an already compiled PyPy interpreter, or you can use CPython to compile it instead. However, since PyPy has a JIT now, it'll be faster to compile PyPy using itself, rather than CPython. (PyPy is now faster than CPython in most cases)
I am a Python newbie coming from a C++ background. While I know it's not Pythonic to try to find a matching concept using my old C++ knowledge, I think this question is still a general question to ask:
Under C++, there is a well known problem called global/static variable initialization order fiasco, due to C++'s inability to decide which global/static variable would be initialized first across compilation units, thus a global/static variable depending on another one in a different compilation unit might be initialized earlier than its dependency, and when the dependent starts to use the services provided by the dependency object, we would have undefined behavior. Here I don't want to go too deep into how C++ solves this problem. :)
In the Python world, I do see uses of global variables, even across different .py files. One typical use case I saw was: initialize one global object in one .py file, and in other .py files the code just fearlessly starts using the global object, assuming that it must have been initialized somewhere else, which in C++ I would definitely find unacceptable, due to the problem I described above.
I am not sure whether the above use case is common practice in Python (Pythonic), and how does Python solve this kind of global variable initialization order problem in general?
Under C++, there is a well known problem called global/static variable initialization order fiasco, due to C++'s inability to decide which global/static variable would be initialized first across compilation units,
I think that statement highlights a key difference between Python and C++: in Python, there is no such thing as different compilation units. What I mean by that is, in C++ (as you know), two different source files might be compiled completely independently from each other, and thus if you compare a line in file A and a line in file B, there is nothing to tell you which will get placed first in the program. It's kind of like the situation with multiple threads: you cannot say whether a particular statement in thread 1 will be executed before or after a particular statement in thread 2. You could say C++ programs are compiled in parallel.
In contrast, in Python, execution begins at the top of one file and proceeds in a well-defined order through each statement in the file, branching out to other files at the points where they are imported. In fact, you could almost think of the import directive as an #include, and in that way you could identify the order of execution of all the lines of code in all the source files in the program. (Well, it's a little more complicated than that, since a module only really gets executed the first time it's imported, and for other reasons.) If C++ programs are compiled in parallel, Python programs are interpreted serially.
Your question also touches on the deeper meaning of modules in Python. A Python module - which is everything that is in a single .py file - is an actual object. Everything declared at "global" scope in a single source file is actually an attribute of that module object. There is no true global scope in Python. (Python programmers often say "global" and in fact there is a global keyword in the language, but it always really refers to the top level of the current module.) I could see that being a bit of a strange concept to get used to coming from a C++ background. It took some getting used to for me, coming from Java, and in this respect Java is a lot more similar to Python than C++ is. (There is also no global scope in Java)
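A sketch that shows both points, that a module body runs only on first import and that its "globals" are attributes of a module object. It writes a throwaway module named demo_once (an invented name) into a temporary directory and imports it twice.

```python
import contextlib
import importlib
import io
import os
import sys
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "demo_once.py"), "w") as f:
        f.write("print('running demo_once top level')\nvalue = 42\n")

    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        out = io.StringIO()
        with contextlib.redirect_stdout(out):
            first = importlib.import_module("demo_once")   # executes the body
            second = importlib.import_module("demo_once")  # served from sys.modules
    finally:
        sys.path.remove(tmp)

printed = out.getvalue()
same_object = first is second
module_attribute = first.value  # the module-level "global" is an attribute
```

Whichever file imports demo_once first triggers its initialization, and everyone else gets the same, already initialized module object.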
I will mention that in Python it is perfectly normal to use a variable without having any idea whether it has been initialized/defined or not. Well, maybe not normal, but at least acceptable under appropriate circumstances. In Python, trying to use an undefined variable raises a NameError; you don't get arbitrary behavior as you might in C or C++, so you can easily handle the situation. You may see this pattern:
try:
    duck.quack()
except NameError:
    pass
which does nothing if duck does not exist. Actually, what you'll more commonly see is
try:
    duck.quack()
except AttributeError:
    pass
which does nothing if duck does not have a method named quack. (AttributeError is the kind of error you get when you try to access an attribute of an object, but the object does not have any attribute by that name.) This is what passes for a type check in Python: we figure that if all we need the duck to do is quack, we can just ask it to quack, and if it does, we don't care whether it's really a duck or not. (It's called duck typing ;-)
Python import executes new Python modules from beginning to end. Subsequent imports only result in a copy of the existing reference in sys.modules, even if still in the middle of importing the module due to a circular import. Module attributes ("global variables" are actually at the module scope) that have been initialized before the circular import will exist.
main.py:
import a
a.py:
var1 = 'foo'
import b
var2 = 'bar'
b.py:
import a
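print(a.var1)  # works: var1 was assigned before a.py reached "import b"
print(a.var2)  # fails with AttributeError: a.py has not assigned var2 yet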