How can I inject code in the beginning of a function? - python

I would like to be able to inject a piece of code in the beginning of a given function.
Research mostly mentions using decorators, but for my specific use case I wouldn't want to wrap the modified function with an additional function call.
I am also unable to add a parameter to the function - I have an already compiled function at runtime and that's it.
The reason I wouldn't want to use a wrapper, is because I'd like to write a utility library which allows the programmer to "paste" a piece of code at the beginning of an already written function, without adding another level to the call stack at all. Mainly for performance reasons.
How can this be done? And would it work across Python versions without breaking?

Premature optimization is the root of all evil. You should not simply assume that a wrapper function will have a major performance impact. There is no safe, simple, portable way to do what you're asking. The most applicable solution is a custom metaclass, as it lets you control how new objects are created.
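For reference, here is a minimal sketch of the source-recompilation route (distinct from the metaclass suggestion above): re-parse the function's source, splice extra statements into the top of its AST body, and recompile. The helper name inject_prelude is invented for this example; the approach assumes the function's source is retrievable via inspect (which the asker's "already compiled function" constraint may rule out) and only handles top-level functions without closures:

import ast
import inspect
import textwrap

def inject_prelude(func, prelude_source):
    # Return a new function with prelude_source's statements inserted
    # at the top of func's body. Requires func's source to be available.
    source = textwrap.dedent(inspect.getsource(func))
    tree = ast.parse(source)
    func_def = tree.body[0]        # the FunctionDef node
    func_def.decorator_list = []   # avoid re-applying decorators
    func_def.body = ast.parse(prelude_source).body + func_def.body
    code = compile(ast.fix_missing_locations(tree), "<injected>", "exec")
    namespace = dict(func.__globals__)  # new function sees a copy of globals
    exec(code, namespace)
    return namespace[func.__name__]

def greet(name):
    return "hello " + name

greet = inject_prelude(greet, "name = name.upper()")
print(greet("world"))  # hello WORLD

Because this produces a brand-new code object, it adds no extra call-stack level; but as the answer says, it is neither simple nor fully safe or portable.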

Related

Is there a way to get the entire C stack trace of a Python code block from Python?

I was trying to figure out the exact control flow and function calls, to better guide me while working in CPython on a fairly large and complicated codebase. It feels like this should be easily doable using pdb, but I can't seem to figure it out. Using pdb.set_trace() only reveals the Python files called during execution. Is this the right way of going about it? Since a fair bit of codegen and dynamic dispatch of function calls is used, I can't precisely find the C++ method definitions of the functions called from the Python code.
It seems like this should be straightforward, but most SO threads don't focus precisely on this, just on the Python-level code-flow sequence.
I was wondering if pdb.pm() would give me what I need, but it doesn't work unless an exception has occurred.

monkey-patching/decorating/wrapping an entire python module

I would like, given a Python module, to monkey-patch all functions, classes and attributes it defines. Simply put, I would like to log every interaction a script I do not directly control has with a module I do not directly control. I'm looking for an elegant solution that does not require prior knowledge of either the module or the code using it.
I found several high-level tools that help with wrapping, decorating, patching, etc., and I've gone over the code of some of them, but I cannot find an elegant way to create a proxy of any given module, as seamlessly as possible, while attaching logic to every interaction (recording input arguments and return values, for example).
In case someone else is looking for a more complete proxy implementation:
Although there are several Python proxy solutions similar to what the OP is looking for, I could not find one that also proxies classes and arbitrary class objects, and that automatically proxies functions' return values and arguments, which is what I needed.
I've got some code written for that purpose as part of a fuller proxying/logging Python solution, and I may turn it into a separate library in the future. If anyone's interested, you can find the core of the code in a pull request. Drop me a line if you'd like it as a standalone library.
My code automatically returns wrapper/proxy objects for any proxied object's attributes, functions and classes. The purpose is to log and replay some of the code, so I've also got the equivalent "replay" code and some logic to store all proxied objects to a JSON file.
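To give a flavour of the general idea, here is a minimal sketch (the class name LoggingProxy is invented for the example); it logs attribute access and calls, but deliberately ignores special methods, recursion into return values, and C-extension corner cases:

import functools

class LoggingProxy(object):
    # Minimal sketch: log every attribute access and call on target.
    def __init__(self, target, name):
        object.__setattr__(self, "_target", target)
        object.__setattr__(self, "_name", name)

    def __getattr__(self, attr):
        value = getattr(self._target, attr)
        qualname = "%s.%s" % (self._name, attr)
        if callable(value):
            @functools.wraps(value)
            def wrapper(*args, **kwargs):
                result = value(*args, **kwargs)
                print("CALL %s args=%r kwargs=%r -> %r"
                      % (qualname, args, kwargs, result))
                return result
            return wrapper
        print("GET %s -> %r" % (qualname, value))
        return value

import json
json_proxy = LoggingProxy(json, "json")
json_proxy.dumps({"a": 1})  # logs the call and its result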

Safe implementation strategy for embedded user-defined expressions

I am designing/prototyping a Domain Specific Language... in Python, for now, at least. The design is straightforward, but it requires support for specifying an arbitrary function (whose domain is a map from labels to integers, and whose range is an integer). In many cases, the function will merely select a label in the domain to yield a result... but I want to allow the specification of any function that could be easily (and efficiently) implemented in a general-purpose programming language.
A caveat is that I want the function to be 'safe'... by this I mean:
A 'pure' function: deterministic, with no side effects (i.e. no external state; no interaction with files, I/O, devices, etc.).
Terminating: either successfully, or after a specific (small) allocation of computational resources has been exhausted.
I am keen that these functions be implemented efficiently: I expect definitions to be provided infrequently and evaluated very frequently. I would also like the functions to be defined using a familiar syntax.
I've considered supporting the implementation of functions in Python... I'm aware that I could impose restrictions using the eval() function, and I've found the ast module, suggesting an approach of parsing to an AST and then interpreting it (or verifying it prior to evaluation). I've also read about pyparsing and considered implementing a bespoke, interpreted language.
I can't help thinking that trying to block undesirable behaviour from eval() is tackling the problem "backwards" (blocking undesirable functionality ex post), whereas implementing a bespoke language would involve re-inventing the wheel.
Does Python already have a safe, efficient, embeddable, expression interpreter?
PyPy has a sandbox.
If you're running this in the web browser (the usual place for untrusted code concerns) consider running it client-side with something like Brython. No-one cares if the user hacks his own machine.
If you do implement a bespoke interpreter, you don't have to re-implement all of the wheel. It's thought to be relatively safe to use compile() on untrusted code, but beware of large constants eating time and memory. Run the compiler in a separate process you can kill. Then you just need to write a Python bytecode interpreter that lacks access to anything important.
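To illustrate the compile-then-verify idea, here is a rough sketch; it is explicitly not a real sandbox (it does nothing about huge constants or runaway arithmetic such as 9**9**9, which still needs the separate killable process), and it assumes Python 3.8+, where literals parse as ast.Constant:

import ast

# Whitelist of AST node types for simple arithmetic/boolean expressions.
ALLOWED_NODES = (
    ast.Expression, ast.Constant, ast.Name, ast.Load,
    ast.BinOp, ast.UnaryOp, ast.BoolOp, ast.Compare, ast.IfExp,
    ast.Add, ast.Sub, ast.Mult, ast.Div, ast.Mod, ast.Pow,
    ast.UAdd, ast.USub, ast.Not, ast.And, ast.Or,
    ast.Eq, ast.NotEq, ast.Lt, ast.LtE, ast.Gt, ast.GtE,
)

def safe_eval(expr, variables):
    # Evaluate an expression over variables (a map from labels to
    # integers), rejecting any syntax outside the whitelist.
    tree = ast.parse(expr, mode="eval")
    for node in ast.walk(tree):
        if not isinstance(node, ALLOWED_NODES):
            raise ValueError("disallowed syntax: " + type(node).__name__)
    code = compile(tree, "<expr>", "eval")
    return eval(code, {"__builtins__": {}}, dict(variables))

print(safe_eval("a + 2 * b", {"a": 1, "b": 3}))  # 7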

Use of eval in Python, MATLAB, etc [duplicate]

This question already has answers here:
Why is using 'eval' a bad practice?
(8 answers)
Closed 9 years ago.
I do know that one shouldn't use eval, for all the obvious reasons (performance, maintainability, etc.). My question is about the other side: is there a legitimate use for it? Where should one use it rather than implementing the code in another way?
Since it is implemented in several languages and can lead to bad programming style, I assume there is a reason why it's still available.
First, here is MathWorks' list of alternatives to eval.
You could also be clever and use eval() in a compiled application to build your own M-code interpreter, but the MATLAB Compiler doesn't allow that, for obvious reasons.
One place where I have found a reasonable use of eval is in obtaining small predicates of code that consumers of my software need to be able to supply as part of a parameter file.
For example, there might be an item called "Data" that has a location for reading and writing the data, but also requires some predicate to be applied to it upon load. In a YAML file, this might look like:
Data:
  Name: CustomerID
  ReadLoc: some_server.some_table
  WriteLoc: write_server.write_table
  Predicate: "lambda x: x[:4]"
Upon loading and parsing the objects from YAML, I can use eval to turn the predicate string into a callable lambda function. In this case, it implies that CustomerID is a long string of which only the first 4 characters are needed in this particular instance.
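Concretely, the loading side can be as small as this sketch (assuming PyYAML; the trust caveats discussed below still apply):

import yaml  # assumes PyYAML is installed

raw = """
Data:
  Name: CustomerID
  ReadLoc: some_server.some_table
  WriteLoc: write_server.write_table
  Predicate: "lambda x: x[:4]"
"""

config = yaml.safe_load(raw)["Data"]
predicate = eval(config["Predicate"])  # trusting the parameter file's author
print(predicate("CUST-00123"))         # CUST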
YAML offers some clunky ways to magically invoke object constructors (e.g. using something like !Data in my example above, and then defining a Data class in the code that hooks appropriately into the YAML constructor machinery). In fact, one of my biggest criticisms of YAML's magic object construction is that it effectively turns your whole parameter file into one giant eval statement. This is very problematic if you need to validate things and if you need flexibility in the way multiple parts of the code absorb multiple parts of the parameter file. It also doesn't lend itself easily to templating with Mako, whereas my approach above makes that easy.
I think this simpler design, which can be easily parsed with standard YAML tools, is better, and using eval lets me allow the user to pass in whatever arbitrary callable they want.
A couple of notes on why this works in my case:
The users of the code are not Python programmers. They don't have the ability to write their own functions and then just pass a module location, function name, and argument signature (although putting all that in a parameter file is another way to solve this that wouldn't rely on eval, if the consumers can be trusted to write code).
The users are responsible for their bad lambda functions. I can do some validation that eval works on the passed predicate, and maybe even create some tests on the fly or have a nice failure mode, but at the end of the day I am allowed to tell them that it's their job to supply valid predicates and to ensure the data can be manipulated with simple predicates. If this constraint wasn't in place, I'd have to shuck this for a different system.
The users of these parameter files form a small group mostly willing to conform to conventions. If that weren't true, there would be a risk of folks hijacking the predicate field to do many inappropriate things, and this would be hard to guard against. On big projects, it would not be a great idea.
I don't know if my points apply very generally, but I would say that using eval to add flexibility to a parameter file is good if you can guarantee your users are a small group of convention-upholders (a rare feat, I know).
In MATLAB the eval function is useful when functions make use of the name of the input argument via the inputname function. For example, to overload the built-in display function (which is sensitive to the name of the input argument), eval is required. To call the built-in display from an overloaded display you would do:
function display(X)
    eval([inputname(1), ' = X;']);
    eval(['builtin(''display'', ', inputname(1), ');']);
end
In MATLAB there is also evalc. From the documentation:
T = evalc(S) is the same as EVAL(S) except that anything that would normally be written to the command window, except for error messages, is captured and returned in the character array T (lines in T are separated by '\n' characters).
If you still count this as eval, it is very powerful when dealing with closed-source code that displays useful information in the command window, where you need to capture and parse that output.

How to document and test interfaces required of formal parameters in Python 2?

To ask my very specific question I find I need quite a long introduction to motivate and explain it -- I promise there's a proper question at the end!
While reading part of a large Python codebase, sometimes one comes across code where the interface required of an argument is not obvious from "nearby" code in the same module or package. As an example:
def make_factory(schema):
    entity = schema.get_entity()
    ...
There might be many "schemas" and "factories" that the code deals with, and "def get_entity()" might be quite common too (or perhaps the function doesn't call any methods on schema, but just passes it to another function). So a quick grep isn't always helpful to find out more about what "schema" is (and the same goes for the return type). Though "duck typing" is a nice feature of Python, sometimes the uncertainty in a reader's mind about the interface of arguments passed in as the "schema" gets in the way of quickly understanding the code (and the same goes for uncertainty about typical concrete classes that implement the interface). Looking at the automated tests can help, but explicit documentation can be better because it's quicker to read. Any such documentation is best when it can itself be tested so that it doesn't get out of date.
Doctests are one possible approach to solving this problem, but that's not what this question is about.
Python 3 has a "parameter annotations" feature (part of the function annotations feature, defined in PEP 3107). The uses to which that feature might be put aren't defined by the language, but it can be used for this purpose. That might look like this:
def make_factory(schema: "xml_schema"):
    ...
Here, "xml_schema" identifies a Python interface that the argument passed to this function should support. Elsewhere there would be code that defines that interface in terms of attributes, methods & their argument signatures, etc. and code that allows introspection to verify whether particular objects provide an interface (perhaps implemented using something like zope.interface / zope.schema). Note that this doesn't necessarily mean that the interface gets checked every time an argument is passed, nor that static analysis is done. Rather, the motivation of defining the interface is to provide ways to write automated tests that verify that this documentation isn't out of date (they might be fairly generic tests so that you don't have to write a new test for each function that uses the parameters, or you might turn on run-time interface checking but only when you run your unit tests). You can go further and annotate the interface of the return value, which I won't illustrate.
So, the question:
I want to do exactly that, but using Python 2 instead of Python 3. Python 2 doesn't have the function annotations feature. What's the "closest thing" in Python 2? Clearly there is more than one way to do it, but I suspect there is one (relatively) obvious way to do it.
For extra points: name a library that implements the one obvious way.
Take a look at plac, which uses annotations to define a command-line interface for a script. On Python 2.x it uses the plac.annotations() decorator.
The closest thing is, I believe, an annotation library called PyAnno.
From the project webpage:
"The Pyanno annotations have two functions:
Provide a structured way to document Python code
Perform limited run-time checking."
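Whatever library you pick, the usual Python 2 substitute for the annotation syntax is a decorator that attaches the mapping to the function; here is a minimal sketch (annotate is a hypothetical helper, not part of plac or PyAnno):

def annotate(**annotations):
    # Store Python-3-style parameter annotations on a Python 2 function.
    def decorator(func):
        func.__annotations__ = annotations
        return func
    return decorator

@annotate(schema="xml_schema")
def make_factory(schema):
    entity = schema.get_entity()
    return entity

print(make_factory.__annotations__)  # {'schema': 'xml_schema'}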
