Why aren't "raise" and "del" functions in Python?


One of the basic changes from Python 2 to Python 3 was making print a function - which, to me, makes perfect sense given its structure. Why aren't the raise and del statements also functions? Especially in the case of raise it seems like it is taking an argument and doing something with it, just like a function does.

raise and del are definitely distinct from functions, each for different reasons:
raise exits the current flow of execution; the normal flow of byte-code interpretation is interrupted and the stack is unwound until the next exception handler is found. Functions can't do this; they create a new stack frame instead.
del can't be a function, because you must specify a specific target; you can't use just any expression, and what is deleted depends on the syntax given: if you use subscription, deletion takes place for a given element in a container; otherwise a name is removed from the current namespace. The right namespace to delete from also depends on the scope of the name being deleted. See the del statement grammar definition:
del_stmt ::= "del" target_list
A function can't remove items from a parent namespace, nor can it distinguish between the result of a subscription expression and a direct reference. You pass objects to a function, but to a del statement you pass a name and a context (the context perhaps being supplied by the interpreter when deleting a local or global name).
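To make the point about targets concrete, here is a small sketch of the different forms del accepts. The names inventory, numbers, Box and try_to_delete are made up for illustration; they are not from the question.

```python
# Each form of del is compiled to a different operation, chosen by syntax.
inventory = {"apples": 3, "pears": 5}
numbers = [10, 20, 30, 40]


class Box:
    pass


box = Box()
box.label = "fragile"

del inventory["apples"]   # subscription: calls inventory.__delitem__("apples")
del numbers[1:3]          # slicing: calls numbers.__delitem__(slice(1, 3))
del box.label             # attribute reference: calls box.__delattr__("label")

temp = "scratch"
del temp                  # plain name: unbinds temp from the current namespace


def try_to_delete(value):
    # A function only receives the object. Deleting the parameter
    # unbinds the local name and leaves the caller's name untouched.
    del value


kept = "still here"
try_to_delete(kept)
```

After this runs, kept is still bound in the caller, which is the limitation described above: a function has no access to the name it was called with.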
print, on the other hand, requires no special relationship with the current namespace or stack frame, and needs no special syntax constraints to do its work. It is purely functionality at the application level. The global sys.stdout reference can be accessed by functions just as much as by the interpreter. As such it didn't need to be a statement, and by moving it to a function, additional benefits were made available, such as being able to override its behaviour and to innovate on it quicker across Python releases.
Do note that part of the raise statement was moved to application-level code instead; in Python 2 you can attach a traceback to the raised exception with:
raise ExceptionClass, exception_value, traceback_object
In Python 3, attaching a traceback to an exception has been moved to the exception itself:
raise Exception("foo occurred").with_traceback(tracebackobj)
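As a rough Python 3 illustration of where such a traceback object might come from, the sketch below captures one with sys.exc_info() and attaches it to a new exception. The function failing and the variable names are invented for the example.

```python
import sys


def failing():
    raise ValueError("original problem")


try:
    failing()
except ValueError:
    # sys.exc_info() returns (type, value, traceback) for the active exception.
    saved_tb = sys.exc_info()[2]

# with_traceback() sets __traceback__ and returns the exception itself,
# so the result can be passed straight to a raise statement.
wrapped = RuntimeError("foo occurred").with_traceback(saved_tb)
```

Writing raise wrapped at this point would report the frames recorded in saved_tb.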

https://www.python.org/dev/peps/pep-3105/ has a list of rationales for why print was made a function. Of the five reasons, (IMO) the most relevant one is:
print is the only application-level functionality that has a statement dedicated to it.
As explained by Alex Martelli here https://stackoverflow.com/a/1054062:
Python statements are things the Python compiler must be specifically aware of -- they may alter the binding of names, may alter control flow, and/or may need to be entirely removed from the generated bytecode in certain conditions (the latter applies to assert). print was the only exception to this assertion in Python 2; by removing it from the roster of statements, Python 3 removes an exception, makes the general assertion "just hold", and therefore is a more regular language.
del and raise obviously alter the binding of names/alter the control flow, thus they both are okay.

Related

Declaring a variable in an if statement, a Python anti pattern?

We discussed in my job about the following piece of Python code (maybe an anti-pattern):
if conditional_variable:
    a = "Some value"
print a
Suppose conditional_variable is defined but a is not.
The question is about using a variable without declaring it. The variable a is created inside a piece of code that may never be executed, yet it is used afterwards.
Maybe the following fix repairs the anti-pattern:
a = "default value"
if conditional_variable:
    a = "changed_value"
print a
In that case, the variable is defined before it is used. Consider print a as a usage of the variable a.

It is not an anti-pattern. It is a bug.
Python has no 'declarations', only binding operations; a name is either bound, or it is not. Trying to access a name that hasn't been bound to yet results in an exception.
Unless your code specifically handles the exception and expected it, running into a NameError or UnboundLocalError exception should be considered a bug.
In other words, code that tries to reference a name should always be subject to the same conditions that bind the name, or be prepared to handle the exception that'll be raised if those conditions don't always hold. Giving your variable a default value outside the if statement means it is bound under all circumstances, so you can also reference it always.
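A minimal sketch of both situations, written as functions so the failure is easy to trigger. The function names are hypothetical, and inside a function the error is UnboundLocalError, a subclass of NameError.

```python
def describe(flag):
    if flag:
        label = "Some value"
    return label  # fails when flag is false: label was never bound


def describe_with_default(flag):
    label = "default value"   # bound under all circumstances
    if flag:
        label = "changed_value"
    return label


try:
    describe(False)
except UnboundLocalError as exc:
    outcome = type(exc).__name__
```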

How to continue a frame execution from last attempted instruction after handling an exception?

I would like to handle a NameError exception by injecting the desired missing variable into the frame and then continue the execution from last attempted instruction.
The following pseudo-code should illustrate my needs.
def function():
    return missing_var
try:
    print function()
except NameError:
    frame = inspect.trace()[-1][0]
    # inject missing variable
    frame.f_globals["missing_var"] = ...
    # continue frame execution from last attempted instruction
    exec frame.f_code from frame.f_lasti
Read the whole unittest on repl.it
Notes
As pointed out by ivan_pozdeev in his answer, this is known as resumption.
After more research, I found Veedrac's answer to the question Resuming program at line number in the context before an exception using a custom sys.excepthook posted by lc2817 very interesting. It relies on Richie Hindle's work.
Background
The code runs in a slave process, which is controlled by a parent. Tasks (functions really) are written in the parent and later passed to the slave using dill. I expect some tasks (running in the slave process) to try to access variables from outer scopes in the parent and I'd like the slave to request those variables from the parent on the fly.
p.s.: I don't expect this magic to run in a production environment.

Contrary to what various commenters are saying, "resume-on-error" exception handling is possible in Python. The library fuckit.py implements said strategy. It steamrollers errors by rewriting the source code of your module at import time, inserting try...except blocks around every statement and swallowing all exceptions. So perhaps you could try a similar sort of tactic?
It goes without saying: that library is intended as a joke. Don't ever use it in production code.
You mentioned that your use case is to trap references to missing names. Have you thought about using metaprogramming to run your code in the context of a "smart" namespace such as a defaultdict? (This is perhaps only marginally less of a bad idea than fuckit.py.)
from collections import defaultdict
class NoMissingNamesMeta(type):
    @classmethod
    def __prepare__(meta, name, bases):
        return defaultdict(lambda: "foo")
class MyClass(metaclass=NoMissingNamesMeta):
    x = y + "bar"  # y doesn't exist
>>> MyClass.x
'foobar'
NoMissingNamesMeta is a metaclass - a language construct for customising the behaviour of the class statement. Here we're using the __prepare__ method to customise the dictionary which will be used as the class's namespace during creation of the class. Thus, because we're using a defaultdict instead of a regular dictionary, a class whose metaclass is NoMissingNamesMeta will never get a NameError. Any names referred to during the creation of the class will be auto-initialised to "foo".
This approach is similar to @AndréFratelli's idea of manually requesting the lazily-initialised data from a Scope object. In production I'd do that, not this. The metaclass version requires less typing to write the client code, but at the expense of a lot more magic. (Imagine yourself debugging this code in two years, trying to understand why non-existent variables are dynamically being brought into scope!)

The "resumption" exception handling technique has proven to be problematic, which is why it's missing from C++ and later languages.
Your best bet is to use a while loop to not resume where the exception was thrown but rather repeat from a predetermined place:
while True:
    try:
        do_something()
    except NameError as e:
        handle_error()
    else:
        break
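Here is one way the retry loop could be filled in for the missing-variable case, as a sketch only. PARENT_VALUES is a hypothetical stand-in for whatever the parent process would send, and the task is restarted from the top rather than resumed.

```python
import re

# Hypothetical stand-in for values that the parent process would supply.
PARENT_VALUES = {"missing_var": 42}


def task():
    return missing_var + 1  # missing_var is not defined on the first attempt


attempts = 0
while True:
    attempts += 1
    try:
        result = task()
    except NameError as exc:
        # exc.name exists on Python 3.10+; otherwise parse it from the message.
        name = getattr(exc, "name", None) or re.search(r"'(\w+)'", str(exc)).group(1)
        # Inject the value into the globals the task actually looks names up in.
        task.__globals__[name] = PARENT_VALUES[name]
    else:
        break
```

Because the whole task is rerun, this is only reasonable when the work done before the NameError is safe to repeat.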

You really can't unwind the stack after an exception is thrown, so you'd have to deal with the issue beforehand. If your requirement is to generate these variables on the fly (which wouldn't be recommended, but you seem to understand that), then you'd have to actually request them. You can implement a mechanism for that (such as having a global custom Scope class instance and overriding __getitem__, or using something like the __dir__ function), but not as you are asking for it.
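A rough sketch of what such a Scope object might look like. The fetch callback and fake_parent_fetch are assumptions made up for the example; in the real setup the callback would do a round trip to the parent process.

```python
class Scope:
    """Look names up locally, asking a fetch callback for anything missing."""

    def __init__(self, fetch):
        self._fetch = fetch      # hypothetical callable: name -> value
        self._values = {}

    def __getitem__(self, name):
        if name not in self._values:
            self._values[name] = self._fetch(name)
        return self._values[name]


requested = []


def fake_parent_fetch(name):
    # Stand-in for a request to the parent process.
    requested.append(name)
    return name.upper()


scope = Scope(fake_parent_fetch)
first = scope["threshold"]
second = scope["threshold"]   # served from the cache, no second request
```

The task code then has to write scope["threshold"] explicitly instead of a bare name, which is the price of avoiding the magic.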

What is a "runtime context"?

(Edited for even more clarity)
I'm reading the Python book (Python Essential Reference by Beazley) and he says:
The with statement allows a series of statements to execute inside a
runtime context that is controlled by an object that serves as a context manager.
Here is an example:
with open("debuglog","a") as f:
    f.write("Debugging\n")
    statements
    f.write("Done\n")
He goes on to say:
The with obj statement accepts an optional as var specifier. If given, the value
returned by obj.__enter__() is placed into var. It is important to emphasize
that obj is not necessarily the value assigned to var.
I understand the mechanics of what a 'with' keyword does: a file-object is returned by open and that object is accessible via f within the body of the block. I also understand that __enter__() and eventually __exit__() will be called.
But what exactly is a run-time context? A few low level details would be nice - or, an example in C. Could someone clarify what exactly a "context" is and how it might relate to other languages (C, C++). My understanding of a context was the environment eg: a Bash shell executes ls in the context of all the (env displayed) shell variables.
With the with keyword - yes f is accessible to the body of the block but isn't that just scoping? eg: for x in y: here x is not scoped within the block and retains its value outside the block - is this what Beazley means when he talks about 'runtime context', that f is scoped only within the block and loses all significance outside the with-block?? Why does he say that the statements "execute inside a runtime context"??? Is this like an "eval"??
I understand that open returns an object that is "not ... assigned to var"??
Why isn't it assigned to var? What does Beazley mean by making a statement like that?

The with statement was introduced in PEP 343. This PEP also introduced a new term, "context manager", and defined what that term means.
Briefly, a "context manager" is an object that has special method functions .__enter__() and .__exit__(). The with statement guarantees that the .__enter__() method will be called to set up the block of code indented under the with statement, and also guarantees that the .__exit__() method function will be called at the time of exit from the block of code (no matter how the block is exited; for example, if the code raises an exception, .__exit__() will still be called).
http://www.python.org/dev/peps/pep-0343/
http://docs.python.org/2/reference/datamodel.html?highlight=context%20manager#with-statement-context-managers
The with statement is now the preferred way to handle any task that has a well-defined setup and teardown. Working with a file, for example:
with open(file_name) as f:
    # do something with file
You know the file will be properly closed when you are done.
Another great example is a resource lock:
with acquire_lock(my_lock):
    # do something
You know the code won't run until you get the lock, and as soon as the code is done the lock will be released. I don't often do multithreaded coding in Python, but when I did, this statement made sure that the lock was always released, even in the face of an exception.
P.S. I did a Google search online for examples of context managers and I found this nifty one: a context manager that executes a Python block in a specific directory.
http://ralsina.me/weblog/posts/BB963.html
EDIT:
The runtime context is the environment that is set up by the call to .__enter__() and torn down by the call to .__exit__(). In my example of acquiring a lock, the block of code runs in the context of having a lock available. In the example of reading a file, the block of code runs in the context of the file being open.
There isn't any secret magic inside Python for this. There is no special scoping, no internal stack, and nothing special in the parser. You simply write two method functions, .__enter__() and .__exit__() and Python calls them at specific points for your with statement.
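To see those call points for yourself, here is a small sketch that records the order of events. The class name Tracked and the events list are invented for the example.

```python
events = []


class Tracked:
    def __enter__(self):
        events.append("enter")
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # exc_type is None when the block finished without an exception.
        events.append("exit: " + (exc_type.__name__ if exc_type else "no error"))
        return False  # a false value lets any exception keep propagating


with Tracked():
    events.append("body")

try:
    with Tracked():
        raise KeyError("boom")
except KeyError:
    events.append("caught outside")
```

The recorded order shows __exit__() running in both cases, before the exception reaches the outer except clause.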
Look again at this section from the PEP:
Remember, PEP 310 proposes roughly this syntax (the "VAR =" part is optional):
with VAR = EXPR:
    BLOCK
which roughly translates into this:
VAR = EXPR
VAR.__enter__()
try:
    BLOCK
finally:
    VAR.__exit__()
In both examples, BLOCK is a block of code that runs in a specific runtime context that is set up by the call to VAR.__enter__() and torn down by VAR.__exit__().
There are two main benefits to the with statement and the way it is all set up.
The more concrete benefit is that it's "syntactic sugar". I would much rather write a two-line with statement than a six-line sequence of statements; it's easier to write the shorter one, it looks nicer and is easier to understand, and it is easier to get right. Six lines versus two means more chances to screw things up. (And before the with statement, I was usually sloppy about wrapping file I/O in a try block; I only did it sometimes. Now I always use with and always get the exception handling.)
The more abstract benefit is that this gives us a new way to think about designing our programs. Raymond Hettinger, in a talk at PyCon 2013, put it this way: when we are writing programs we look for common parts that we can factor out into functions. If we have code like this:
A
B
C
D
E
F
B
C
D
G
we can easily make a function:
def BCD():
    B
    C
    D
A
BCD()
E
F
BCD()
G
But we have never had a really clean way to do this with setup/teardown. When we have a lot of code like this:
A
BCD()
E
A
XYZ()
E
A
PDQ()
E
Now we can define a context manager and rewrite the above:
with contextA:
    BCD()
with contextA:
    XYZ()
with contextA:
    PDQ()
So now we can think about our programs and look for setup/teardown that can be abstracted into a "context manager". Raymond Hettinger showed several new "context manager" recipes he had invented (and I'm racking my brain trying to remember an example or two for you).
EDIT: Okay, I just remembered one. Raymond Hettinger showed a recipe, that will be built in to Python 3.4, for using a with statement to ignore an exception within a block. See it here: https://stackoverflow.com/a/15566001/166949
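In the standard library this is available as contextlib.suppress, added in Python 3.4. A minimal sketch of its use follows; the file name is made up and assumed not to exist.

```python
import contextlib
import os

# Equivalent to try: ... except FileNotFoundError: pass
with contextlib.suppress(FileNotFoundError):
    os.remove("no_such_file_for_this_example.tmp")

reached = True  # execution continues here even though the file was missing
```

Only the listed exception types are swallowed; anything else still propagates out of the block.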
P.S. I've done my best to give the sense of what he was saying... if I have made any mistake or misstated anything, it's on me and not on him. (And he posts on StackOverflow sometimes so he might just see this and correct me if I've messed anything up.)
EDIT: You've updated the question with more text. I'll answer it specifically as well.
is this what Beazley means when he talks about 'runtime context', that f is scoped only within the block and loses all significance outside the with-block?? Why does he say that the statements "execute inside a runtime context"??? Is this like an "eval"??
Actually, f is not scoped only within the block. When you bind a name using the as keyword in a with statement, the name remains bound after the block.
The "runtime context" is an informal concept and it means "the state set up by the .__enter__() method function call and torn down by the .__exit__() method function call." Again, I think the best example is the one about getting a lock before the code runs. The block of code runs in the "context" of having the lock.
I understand that open returns an object that is "not ... assigned to var"?? Why isn't it assigned to var? What does Beazley mean by making a statement like that?
Okay, suppose we have an object, let's call it k. k implements a "context manager", which means that it has method functions k.__enter__() and k.__exit__(). Now we do this:
with k as x:
    # do something
What David Beazley wants you to know is that x will not necessarily be bound to k. x will be bound to whatever k.__enter__() returns. k.__enter__() is free to return a reference to k itself, but is also free to return something else. In this case:
with open(some_file) as f:
    # do something
The call to open() returns an open file object, which works as a context manager, and its .__enter__() method function really does just return a reference to itself.
I think most context managers return a reference to self. Since it's an object it can have any number of member variables, so it can return any number of values in a convenient way. But it isn't required.
For example, there could be a context manager that starts a daemon running in the .__enter__() function, and returns the process ID number of the daemon from the .__enter__() function. Then the .__exit__() function would shut down the daemon. Usage:
with start_daemon("parrot") as pid:
    print("Parrot daemon running as PID {}".format(pid))
    daemon = lookup_daemon_by_pid(pid)
    daemon.send_message("test")
But you could just as well return the context manager object itself with any values you need tucked inside:
with start_daemon("parrot") as daemon:
    print("Parrot daemon running as PID {}".format(daemon.pid))
    daemon.send_message("test")
If we need the PID of the daemon, we can just put it in a .pid member of the object. And later if we need something else we can just tuck that in there as well.
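Since start_daemon is imaginary, here is a runnable toy version of the first variant, just to show that the name after as is bound to the return value of __enter__() and not to the manager. StartWorker and the id 4321 are made up; nothing is actually started.

```python
class StartWorker:
    """Hypothetical manager: pretends to start a worker and returns only its id."""

    def __init__(self, name):
        self.name = name
        self.running = False

    def __enter__(self):
        self.running = True
        return 4321            # made-up id; deliberately not the manager itself

    def __exit__(self, exc_type, exc_value, traceback):
        self.running = False   # teardown happens however the block exits
        return False


manager = StartWorker("parrot")
with manager as worker_id:
    bound_to_manager = worker_id is manager
    running_inside = manager.running
```

Note that worker_id stays bound after the block ends, as mentioned earlier; only the state managed by __exit__() is torn down.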

The with context takes care that on entry, the __enter__ method is called and the given var is set to whatever __enter__ returns.
In most cases, that is the very object the with statement was given (as in the file case), but for a database, for example, it may be a cursor object rather than the connection object.
The file example can be extended like this:
f1 = open("debuglog","a")
with f1 as f2:
    print f1 is f2
which will print True as here, the file object is returned by __enter__. (From its point of view, self.)
A database works like
d = connect(...)
with d as c:
    print d is c # False
    print d, c
Here, d and c are completely different: d is the connection to the database, c is a cursor used for one transaction.
The with clause is terminated by a call to __exit__() which is given the state of execution of the clause - either success or failure. In this case, the __exit__() method can act appropriately.
In the file example, the file is closed no matter if there was an error or not.
In the database example, normally the transaction is committed on success and rolled back on failure.
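The commit/rollback behaviour can be tried with the standard library's sqlite3 module, as sketched below with an in-memory database and a made-up table. One difference from the driver described above: a sqlite3 connection's __enter__() returns the connection itself rather than a cursor.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE log (msg TEXT)")

# Successful block: the transaction is committed on exit.
with conn as c:
    same_object = c is conn
    c.execute("INSERT INTO log VALUES ('kept')")

# Failing block: the transaction is rolled back on exit.
try:
    with conn:
        conn.execute("INSERT INTO log VALUES ('discarded')")
        raise RuntimeError("simulated failure")
except RuntimeError:
    pass

rows = [row[0] for row in conn.execute("SELECT msg FROM log")]
conn.close()  # the with block does not close a sqlite3 connection
```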
The context manager is for easy initialisation and cleanup of things like exactly these - files, databases etc.
There is no direct correspondence in C or C++ that I am aware of.
C knows no concept of exception, so none can be caught in an __exit__(). C++ knows exceptions, and there seem to be ways to do so (look below at the comments).

exec code in original local scope of frame

I've written a remote Python debugger and one of the features I need is to execute arbitrary code while stopped at a breakpoint. My debugger uses the following to execute code received from the remote debugger:
exec (compile(code, '<string>', 'single') , frame.f_globals, frame.f_locals)
This works fine for the most part, but I've noticed a couple issues.
Assignment statements aren't actually applied to the original locals dictionary. This is probably due to the fact that f_locals is supposed to be read-only.
If stopped within a class method, accessing protected attributes (names beginning with double underscore) does not work. I'm assuming this is due to the name mangling that Python performs on protected attributes.
So my question is, is there a way around these limitations? Can I trick Python into thinking that the code is being executed in the actual local scope of that frame?
I'm using CPython 2.7, and I'm willing to accept a solution/hack specific to this version.

Assignment statements aren't actually
applied to the original locals
dictionary. This is probably due to
the fact that f_locals is supposed to
be read-only.
Not exactly, but the bytecode for the function will not look at locals, using rather a simple but crucial optimization whereby local variables are in a simple array, avoiding runtime lookups. The only way to avoid this (and make the function much, much slower) is compiling different code, e.g. code starting with an exec '' to force the compiler to avoid the optimization (in Python 2; no way, in Python 3). If you need to work with existing bytecode, you're out of luck: there is no way to accomplish what you desire.
If stopped within a class method,
accessing protected attributes (names
beginning with double underscore) does
not work. I'm assuming this is due to
the name mangling that Python performs
on protected attributes.
Yep, so this issue does allow a workaround: prepend _Classname to the name to mimic what the compiler does. Note that a double-underscore prefix means private: protected would be a single underscore (and would give you no trouble). Private names are specifically meant to avoid accidental clashes with names bound in subclasses (and work decently for that one purpose, though not perfectly, and not for anything else;-).
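A short sketch of that workaround, using a made-up Account class. Code compiled outside the class body, as a debugger's exec or eval input is, does not get mangled, so the mangled name has to be spelled out.

```python
class Account:
    def __init__(self):
        self.__balance = 100     # stored on the instance as _Account__balance

    def balance(self):
        return self.__balance    # the compiler rewrites this inside the class


acct = Account()

# Code compiled separately must use the mangled name explicitly.
code = compile("acct._Account__balance", "<string>", "eval")
via_eval = eval(code, {"acct": acct})

unmangled_failed = False
try:
    eval("acct.__balance", {"acct": acct})
except AttributeError:
    unmangled_failed = True
```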

I'm not sure I've understood you correctly, but exec does populate the locals parameter with assignments inside the code:
>>> loc = {}
>>> exec(compile('a=3', '<string>', 'single'), {}, loc)
>>> loc
{'a': 3}
Perhaps f_locals doesn't allow writes.

to execute arbitrary code while stopped at a breakpoint ... Can I trick Python into thinking that the code is being executed in the actual local scope of that frame?
The Python debugger, pdb, allows this. For example, let's say you are debugging the file tests/scopeTest.py, and you have the following line in your program, where the variable hasn't been declared in the program itself :
print (NOT_DEFINED_IN_PROGRAM)
so that running the code python tests/scopeTest.py would result in :
NameError: name 'NOT_DEFINED_IN_PROGRAM' is not defined
Now you would like to define that variable when stopped at that line in the debugger, and have the program continue executing, using that variable as if it had been defined in the program all along. In other words, you would like to effect the change within that scope, so that you can continue execution with that change permanent. It is actually possible :
$ python -m pdb tests/scopeTest.py
> /home/user/tests/scopeTest.py(1)<module>()
-> print (NOT_DEFINED_IN_PROGRAM)
(Pdb) 'NOT_DEFINED_IN_PROGRAM' in locals()
False
(Pdb) NOT_DEFINED_IN_PROGRAM = 5
(Pdb) 'NOT_DEFINED_IN_PROGRAM' in locals()
True
(Pdb) step
5
Pdb does this through a compile and exec in its default function, which does the equivalent of :
code = compile(line + '\n', '<stdin>', 'single')
exec(code, self.curframe.f_globals, self.curframe_locals)
where self.curframe is a specific frame. Now, self.curframe_locals is not self.curframe.f_locals, because, as the setup function says :
# The f_locals dictionary is updated from the actual frame
# locals whenever the .f_locals accessor is called, so we
# cache it here to ensure that modifications are not overwritten.
self.curframe_locals = self.curframe.f_locals
Hope that helps, and is what you meant!
Take note that, even then, should you want to, for example, replace a function in the context of the program being debugged with a monkey-patched version, such as:
newGlobals['abs'] = myCustomAbsFunction
exec(code, newGlobals, locals)
the scope of the myCustomAbsFunction is not going to be the user program, but is going to be the context of where that function was defined, which is the debugger! There is a way around that too, but as it wasn't specifically asked, it is left as an exercise for the reader, for now. ^__^

Calling a hook function every time an Exception is raised

Let's say I want to be able to log to file every time any exception is raised, anywhere in my program. I don't want to modify any existing code.
Of course, this could be generalized to being able to insert a hook every time an exception is raised.
Would the following code be considered safe for doing such a thing?
class MyException(Exception):
    def my_hook(self):
        print('---> my_hook() was called');
    def __init__(self, *args, **kwargs):
        global BackupException;
        self.my_hook();
        return BackupException.__init__(self, *args, **kwargs);
def main():
    global BackupException;
    global Exception;
    BackupException = Exception;
    Exception = MyException;
    raise Exception('Contrived Exception');
if __name__ == '__main__':
    main();

If you want to log uncaught exceptions, just use sys.excepthook.
I'm not sure I see the value of logging all raised exceptions, since lots of libraries will raise/catch exceptions internally for things you probably won't care about.
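A minimal sketch of the sys.excepthook approach. The logger name "uncaught" is arbitrary, and configuring where the log goes (for example a file handler) is left out.

```python
import logging
import sys

logger = logging.getLogger("uncaught")   # arbitrary logger name


def log_uncaught(exc_type, exc_value, exc_tb):
    # Called by the interpreter for exceptions that nothing caught.
    logger.error("Uncaught exception", exc_info=(exc_type, exc_value, exc_tb))
    # Keep the default behaviour (printing the traceback to stderr).
    sys.__excepthook__(exc_type, exc_value, exc_tb)


sys.excepthook = log_uncaught
```

This runs only for exceptions that propagate all the way out of the program, so existing code does not need to be modified.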

Your code as far as I can tell would not work.
__init__ has to return None and you are trying to return an instance of backup exception. In general if you would like to change what instance is returned when instantiating a class you should override __new__.
Unfortunately you can't change any of the attributes on the Exception class. If that was an option you could have changed Exception.__new__ and placed your hook there.
The "global Exception" trick will only work for code in the current module. Exception is a builtin and if you really want to change it globally you need to import __builtin__; __builtin__.Exception = MyException
Even if you changed __builtin__.Exception it will only affect future uses of Exception, subclasses that have already been defined will use the original Exception class and will be unaffected by your changes. You could loop over Exception.__subclasses__ and change the __bases__ for each one of them to insert your Exception subclass there.
There are subclasses of Exception that are also built-in types that you also cannot modify, although I'm not sure you would want to hook any of them (think StopIteration).
I think that the only decent way to do what you want is to patch the Python sources.
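The point about not being able to change attributes on the Exception class is easy to check. The sketch below assumes CPython, where assigning to an attribute of a built-in type raises TypeError; the hook function is hypothetical.

```python
def hook(cls, *args, **kwargs):
    # Hypothetical replacement that would run custom code before creation.
    return BaseException.__new__(cls, *args, **kwargs)


try:
    Exception.__new__ = hook     # rejected: built-in types are immutable
except TypeError as exc:
    patch_error = type(exc).__name__
else:
    patch_error = None
```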

This code will not affect any exception classes that were created before the start of main, and most of the exceptions that happen will be of such kinds (KeyError, AttributeError, and so forth). And you can't really affect those "built-in exceptions" in the most important sense -- if anywhere in your code is e.g. a 1/0, the real ZeroDivisionError will be raised (by Python's own internals), not whatever else you may have bound to that exception's name.
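A small sketch of that last point: code is run in a namespace where the name ZeroDivisionError has been rebound to a made-up class, and the interpreter still raises the real one.

```python
import builtins


class MyZeroDivisionError(ArithmeticError):
    pass


# A namespace in which the name ZeroDivisionError points at our own class.
namespace = {"ZeroDivisionError": MyZeroDivisionError}

try:
    exec("1 / 0", namespace)
except ArithmeticError as exc:
    raised_type = type(exc)

is_real_builtin = raised_type is builtins.ZeroDivisionError
```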
So, I don't think your code can do what you want (despite all the semicolons, it's still supposed to be Python, right?) -- it could be done by patching the C sources for the Python runtime, essentially (e.g. by providing a hook potentially called on any exception even if it's later caught) -- such a hook currently does not exist because the use cases for it would be pretty rare (for example, a StopIteration is always raised at the normal end of every for loop -- and caught, too; why on Earth would one want to trace that, and the many other routine uses of caught exceptions in the Python internals and standard library?!).

Download pypy and instrument it.
