How to extend built-in classes in python

How to extend built-in classes in python - python

I'm writing some code for an esp8266 micro controller using micro-python and it has some different class as well as some additional methods in the standard built in classes. To allow me to debug on my desktop I've built some helper classes so that the code will run. However I've run into a snag with micro-pythons time function which has a time.sleep_ms method since the standard time.sleep method on micropython does not accept floats. I tried using the following code to extend the built in time class but it fails to import properly. Any thoughts?
class time(time):
def __init__(self, *args, **kwargs):
super().__init__(*args, **kwargs)
def sleep_ms(self, ms):
super().sleep(ms/1000)
This code exists in a file time.py. Secondly I know I'll have issues with having to import time.time that I would like to fix. I also realize I could call this something else and put traps for it in my micro controller code however I would like to avoid any special functions in what's loaded into the controller to save space and cycles.

You're not trying to override a class, you're trying to monkey-patch a module.
First off, if your module is named time.py, it will never be loaded in preference to the built-in time module. Truly built-in (as in compiled into the interpreter core, not just C extension modules that ship with CPython) modules are special, they are always loaded without checking sys.path, so you can't even attempt to shadow the time module, even if you wanted to (you generally don't, and doing so is incredibly ugly). In this case, the built-in time module shadows you; you can't import your module under the plain name time at all, because the built-in will be found without even looking at sys.path.
Secondly, assuming you use a different name and import it for the sole purpose of monkey-patching time (or do something terrible like adding the monkey patch to a custom sitecustomize module, it's not trivial to make the function truly native to the monkey-patched module (defining it in any normal way gives it a scope of the module where it was defined, not the same scope as other functions from the time module). If you don't need it to be "truly" defined as part of time, the simplest approach is just:
import time
def sleep_ms(ms):
return time.sleep(ms / 1000)
time.sleep_ms = sleep_ms
Of course, as mentioned, sleep_ms is still part of your module, and carries your module's scope around with it (that's why you do time.sleep, not just sleep; you could do from time import sleep to avoid qualifying it, but it's still a local alias that might not match time.sleep if someone else monkey-patches time.sleep later).
If you want to make it behave like it's part of the time module, so you can reference arbitrary things in time's namespace without qualification and always see the current function in time, you need to use eval to compile your code in time's scope:
import time
# Compile a string of the function's source to a code object that's not
# attached to any scope at all
# The filename argument is garbage, it's just for exception traceback
# reporting and the like
code = compile('def sleep_ms(ms): sleep(ms / 1000)', 'time.py', 'exec')
# eval the compiled code with a scope of the globals of the time module
# which both binds it to the time module's scope, and inserts the newly
# defined function directly into the time module's globals without
# defining it in your own module at all
eval(code, vars(time))
del code, time # May as well leave your monkey-patch module completely empty

Related

Defining a module from within a module [duplicate]

I'd like to dynamically create a module from a dictionary, and I'm wondering if adding an element to sys.modules is really the best way to do this. EG
context = { a: 1, b: 2 }
import types
test_context_module = types.ModuleType('TestContext', 'Module created to provide a context for tests')
test_context_module.__dict__.update(context)
import sys
sys.modules['TestContext'] = test_context_module
My immediate goal in this regard is to be able to provide a context for timing test execution:
import timeit
timeit.Timer('a + b', 'from TestContext import *')
It seems that there are other ways to do this, since the Timer constructor takes objects as well as strings. I'm still interested in learning how to do this though, since a) it has other potential applications; and b) I'm not sure exactly how to use objects with the Timer constructor; doing so may prove to be less appropriate than this approach in some circumstances.
EDITS/REVELATIONS/PHOOEYS/EUREKA:
I've realized that the example code relating to running timing tests won't actually work, because import * only works at the module level, and the context in which that statement is executed is that of a function in the testit module. In other words, the globals dictionary used when executing that code is that of __main__, since that's where I was when I wrote the code in the interactive shell. So that rationale for figuring this out is a bit botched, but it's still a valid question.
I've discovered that the code run in the first set of examples has the undesirable effect that the namespace in which the newly created module's code executes is that of the module in which it was declared, not its own module. This is like way weird, and could lead to all sorts of unexpected rattlesnakeic sketchiness. So I'm pretty sure that this is not how this sort of thing is meant to be done, if it is in fact something that the Guido doth shine upon.
The similar-but-subtly-different case of dynamically loading a module from a file that is not in python's include path is quite easily accomplished using imp.load_source('NewModuleName', 'path/to/module/module_to_load.py'). This does load the module into sys.modules. However this doesn't really answer my question, because really, what if you're running python on an embedded platform with no filesystem?
I'm battling a considerable case of information overload at the moment, so I could be mistaken, but there doesn't seem to be anything in the imp module that's capable of this.
But the question, essentially, at this point is how to set the global (ie module) context for an object. Maybe I should ask that more specifically? And at a larger scope, how to get Python to do this while shoehorning objects into a given module?

Hmm, well one thing I can tell you is that the timeit function actually executes its code using the module's global variables. So in your example, you could write
import timeit
timeit.a = 1
timeit.b = 2
timeit.Timer('a + b').timeit()
and it would work. But that doesn't address your more general problem of defining a module dynamically.
Regarding the module definition problem, it's definitely possible and I think you've stumbled on to pretty much the best way to do it. For reference, the gist of what goes on when Python imports a module is basically the following:
module = imp.new_module(name)
execfile(file, module.__dict__)
That's kind of the same thing you do, except that you load the contents of the module from an existing dictionary instead of a file. (I don't know of any difference between types.ModuleType and imp.new_module other than the docstring, so you can probably use them interchangeably) What you're doing is somewhat akin to writing your own importer, and when you do that, you can certainly expect to mess with sys.modules.
As an aside, even if your import * thing was legal within a function, you might still have problems because oddly enough, the statement you pass to the Timer doesn't seem to recognize its own local variables. I invoked a bit of Python voodoo by the name of extract_context() (it's a function I wrote) to set a and b at the local scope and ran
print timeit.Timer('print locals(); a + b', 'sys.modules["__main__"].extract_context()').timeit()
Sure enough, the printout of locals() included a and b:
{'a': 1, 'b': 2, '_timer': <built-in function time>, '_it': repeat(None, 999999), '_t0': 1277378305.3572791, '_i': None}
but it still complained NameError: global name 'a' is not defined. Weird.

Private functions in python

Is it possible to avoid importing a file with from file import function?
Someone told me i would need to put an underscore as prefix, like: _function, but isn't working.
I'm using Python 2.6 because of a legacy code.

There are ways you can prevent the import, but they're generally hacks and you want to avoid them. The normal method is to just use the underscore:
def _function():
pass
Then, when you import,
from my_module import *
You'll notice that _function is not imported because it begins with an underscore. However, you can always do this:
# In Python culture, this is considered rude
from my_module import _function
But you're not supposed to do that. Just don't do that, and you'll be fine. Python's attitude is that we're all adults. There are a lot of other things you're not supposed to do which are far worse, like
import my_module
# Remove the definition for _function!
del my_module._function

There is no privacy in Python. There are only conventions governing what external code should consider publicly accessible and usable.
Importing a module for the first time, triggers the creation of a module object and the execution of all top-level code in the module. The module object contains the global namespace with the result of that code having run.
Because Python is dynamic you can always introspect the module namespace; you can see all names defined, all objects those names reference, and you can access and alter everything. It doesn't matter here if those names start with underscores or not.
So the only reason you use a leading _ underscore for a name, is to document that the name is internal to the implementation of the module, and that external code should not rely on that name existing in a future version. The from module import * syntax will ignore such names for that reason alone. But you can't prevent a determined programmer from accessing such a name anyway. They simply do so at their own risk, it is not your responsibility to keep them from that access.
If you have functions or other objects that are only needed to initialise the module, you are of course free to delete those names at the end.

What is the mechanism that allows Python monkey patching in this instance?

Can someone explain the logic behind how this works with the Python interpreter? Is this behavior only thread local? Why does the assignment in the first module import persist after the second module import? I just had a long debugging session that came down to this.
external_library.py
def the_best():
print "The best!"
modify_external_library.py
import external_library
def the_best_2():
print "The best 2!"
external_library.the_best = the_best_2
main.py
import modify_external_library
import external_library
external_library.the_best()
Test:
$ python main.py
The best 2!

Nothing thread-local about this. somemodule.anattr = avalue is very global behavior! After this assignment the attribute is changed for good (until maybe changed back later) no matter what.
There's no mysterious mechanics at play! Assignment to any attribute of an object that allows such assignment (as module objects do) just work in the obvious way -- no thread-local anything, nothing strange -- and assignment to attribute persists, as long as the object whose attribute you've assigned persists, of course.
The repeated import external_library doesn't reload the module (reload is a totally separate builtin and import does not call it!) -- it just checks sys.modules, finds an external_library key in that dict, and binds the corresponding value (which was previously modified by that assignment) to name external_library in the appropriate namespace (here, globals of module main).

As Alex indicated you need to reload external_library, simply importing it will do nothing if it's already been imported. You can check that by putting print statements into your external_library and modify_external_library modules.
import modify_external_library
#import external_library
reload(external_library)
external_library.the_best()
output
The best!

Modules are instances of new-style classes. When you modify the attributes of a module (the function in this case), you are modifying the module instance. When you try to import it again (with import external_library), you're just getting the same module object already referenced inside of modify_external_library.py.
Edit: Of course, trying to import the same module again does not really work (as Alex Martelli points out). Once loaded, modules are not re-initialized unless done so explicitly with reload.

Monkey patching works because classes are modifiable in python but the mechanism that allows it to spread like this is that once any module has been imported, and initialised, later imports simply add the existing instance to the local namespace without rerunning the initialisation, this also saves time when a module has a lot of initialisation as well as allowing monkey patches.

Importing class from another file in python - I know the fix, but why doesn't the original work?

I can make this code work, but I am still confused why it won't work the first way I tried.
I am practicing python because my thesis is going to be coded in it (doing some cool things with Arduino and PC interfaces). I'm trying to import a class from another file into my main program so that I can create objects. Both files are in the same directory. It's probably easier if you have a look at the code at this point.
#from ArduinoBot import *
#from ArduinoBot import ArduinoBot
import ArduinoBot
# Create ArduinoBot object
bot1 = ArduinoBot()
# Call toString inside bot1 object
bot1.toString()
input("Press enter to end.")
Here is the very basic ArduinoBot class
class ArduinoBot:
def toString(self):
print ("ArduinoBot toString")
Either of the first two commented out import statements will make this work, but not the last one, which to me seems the most intuitive and general. There's not a lot of code for stuff to go wrong here, it's a bit frustrating to be hitting these kind of finicky language specific quirks when I had heard some many good things about Python. Anyway I must be doing something wrong, but why doesn't the simple 'import ClassName' or 'import FileName' work?
Thank you for your help.

consider a file (example.py):
class foo(object):
pass
class bar(object):
pass
class example(object):
pass
Now in your main program, if you do:
import example
what should be imported from the file example.py? Just the class example? should the class foo come along too? The meaning would be too ambiguous if import module pulled the whole module's namespace directly into your current namespace.
The idea is that namespaces are wonderful. They let you know where the class/function/data came from. They also let you group related things together (or equivalently, they help you keep unrelated things separate!). A module sets up a namespace and you tell python exactly how you want to bring that namespace into the current context (namespace) by the way you use import.
from ... import * says -- bring everything in that module directly into my namespace.
from ... import ... as ... says, bring only the thing that I specify directly into my namespace, but give it a new name here.
Finally, import ... simply says bring that module into the current namespace, but keep it separate. This is the most common form in production code because of (at least) 2 reasons.
It prevents name clashes. You can have a local class named foo which won't conflict with the foo in example.py -- You get access to that via example.foo
It makes it easy to trace down which module a class came from for debugging.
consider:
from foo import *
from bar import *
a = AClass() #did this come from foo? bar? ... Hmmm...
In this case, to get access to the class example from example.py, you could also do:
import example
example_instance = example.example()
but you can also get foo:
foo_instance = example.foo()

The simple answer is that modules are things in Python. A module has its own status as a container for classes, functions, and other objects. When you do import ArduinoBot, you import the module. If you want things in that module -- classes, functions, etc. -- you have to explicitly say that you want them. You can either import them directly with from ArduinoBot import ..., or access them via the module with import ArduinoBot and then ArduinoBot.ArduinoBot.
Instead of working against this, you should leverage the container-ness of modules to allow you to group related stuff into a module. It may seem annoying when you only have one class in a file, but when you start putting multiple classes and functions in one file, you'll see that you don't actually want all that stuff being automatically imported when you do import module, because then everything from all modules would conflict with other things. The modules serve a useful function in separating different functionality.
For your example, the question you should ask yourself is: if the code is so simple and compact, why didn't you put it all in one file?

Import doesn't work quite the you think it does. It does work the way it is documented to work, so there's a very simple remedy for your problem, but nonetheless:
import ArduinoBot
This looks for a module (or package) on the import path, executes the module's code in a new namespace, and then binds the module object itself to the name ArduinoBot in the current namespace. This means a module global variable named ArduinoBot in the ArduinoBot module would now be accessible in the importing namespace as ArduinoBot.ArduinoBot.
from ArduinoBot import ArduinoBot
This loads and executes the module as above, but does not bind the module object to the name ArduinoBot. Instead, it looks for a module global variable ArduinoBot within the module, and binds whatever object that referred to the name ArduinoBot in the current namespace.
from ArduinoBot import *
Similarly to the above, this loads and executes a module without binding the module object to any name in the current namespace. It then looks for all module global variables, and binds them all to the same name in the current namespace.
This last form is very convenient for interactive work in the python shell, but generally considered bad style in actual development, because it's not clear what names it actually binds. Considering it imports everything global in the imported module, including any names that it imported at global scope, it very quickly becomes extremely difficult to know what names are in scope or where they came from if you use this style pervasively.

The module itself is an object. The last approach does in fact work, if you access your class as a member of the module. Either if the following will work, and either may be appropriate, depending on what else you need from the imported items:
from my_module import MyClass
foo = MyClass()
or
import my_module
foo = my_module.MyClass()
As mentioned in the comments, your module and class usually don't have the same name in python. That's more a Java thing, and can sometimes lead to a little confusion here.

Why bother to limit the types imported from a python package?

When using many IDEs that support autocompletion with Python, things like this will show warnings, which I find annoying:
from eventlet.green.httplib import BadStatusLine
When switching to:
from eventlet.green.httplib import *
The warnings go away. What's the benefit to limiting imports to a specific set of types you'll use? Is the parsing faster? Reduces collisions? What other point is there? It seems the state of python IDEs and the nature of the typing system makes it hard for many IDEs to fully get right when a type import works and when it doesn't.

By typing from foo import *, you import all the names defined in foo into the global namespace. This is bad practice because you could have name clashes both with other modules and with built-ins.
For example, consider a module foo
#foo.py
def open(something):
pass
and a module bar:
#bar.py
def open(something_else):
pass
Now, from foo import * hides the built-in function open() which means that any calls to open() now refer to foo.open() rather than the built-in. Worse, if you then have from bar import *, the function open() in bar now hides both the built-in and the function imported from foo.
In the example above, from foo import open is equally shadowing the built-in function, but one glance at the code tells you why you can't open files for IO anymore.
This is why you should import only specific names, ensuring that you know what names are imported. Alternatively, you could use fully qualified names (import foo; foo.open(), which is perfectly safe).
EDIT: Just as a note, this can be horribly compounded if the module you're importing also uses from x import *. In this case, not only do you typically import all the stuff in the module foo, but also all the stuff in the module x into the global namespace. This can very quickly turn into an absolute mess.

It reduces collisions with user-defined types, it reduces coupling and it's self-documenting, since it makes clear from the outset of the module which classes are coming from libraries (so the rest must be user-defined). The parsing is not faster, at least not in CPython: an imported module must be read in its entirety to look for the classes/functions being imported.
(I must admit that I never use an IDE.)

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.

How to extend built-in classes in python - python

Related

Defining a module from within a module [duplicate]

Private functions in python

What is the mechanism that allows Python monkey patching in this instance?

Importing class from another file in python - I know the fix, but why doesn't the original work?

Why bother to limit the types imported from a python package?

Categories

Resources