I would like, given a python module, to monkey patch all functions, classes and attributes it defines. Simply put, I would like to log every interaction a script I do not directly control has with a module I do not directly control. I'm looking for an elegant solution that will not require prior knowledge of either the module or the code using it.
I found several high-level tools that help wrapping, decorating, patching etc... and i've went over the code of some of them, but I cannot find an elegant solution to create a proxy of any given module and automatically proxy it, as seamlessly as possible, except for appending logic to every interaction (record input arguments and return value, for example).
in case someone else is looking for a more complete proxy implementation
Although there are several python proxy solutions similar to those OP is looking for, I could not find a solution that will also proxy classes and arbitrary class objects, as well as automatically proxy functions return values and arguments. Which is what I needed.
I've got some code written for that purpose as part of a full proxying/logging python execution and I may make it into a separate library in the future. If anyone's interested you can find the core of the code in a pull request. Drop me a line if you'd like this as a standalone library.
My code will automatically return wrapper/proxy objects for any proxied object's attributes, functions and classes. The purpose is to log and replay some of the code, so i've got the equivalent "replay" code and some logic to store all proxied objects to a json file.
Related
I'm working on a project to do some static analysis of Python code. We're hoping to encode certain conventions that go beyond questions of style or detecting code duplication. I'm not sure this question is specific enough, but I'm going to post it anyway.
A few of the ideas that I have involve being able to build a certain understanding of how the various parts of source code work so we can impose these checks. For example, in part of our application that's exposing a REST API, I'd like to validate something like the fact that if a route is defined as a GET, then arguments to the API are passed as URL arguments rather than in the request body.
I'm able to get something like that to work by pulling all the routes, which are pretty nicely structured, and there are guarantees of consistency given the route has to be created as a route object. But once I know that, say, a given route is a GET, figuring out how the handler function uses arguments requires some degree of interpretation of the function source code.
Naïvely, something like inspect.getsourcelines will allow me to get the source code, but on further examination that's not the best solution because I immediately have to build interpreter-like features, such as figuring out whether a line is a comment, and then do something like use regular expressions to hunt down places where state is moved from the request context to a local variable.
Looking at tools like PyLint, they seem mostly focused on high-level "universals" of static analysis, and (at least on superficial inspection) don't have obvious ways of extracting this sort of understanding at a lower level.
Is there a more systematic way to get this representation of the source code, either with something in the standard library or with another tool? Or is the only way to do this writing a mini-interpreter that serves my purposes?
In my script I'm having to run several functions within a class by using if __name__ == "__main__" but I am unable to adjust the functions within the class due to other people needing to use it for other purposes. The class expects a number of objects to be passed in (these objects are essentially empty arrays that can be filled with certain commands). If I pass in None instead, what will happen when the class functions try to perform operations on them? I presume it will crash because the object's functions will no longer be defined if there is no object. However, is there perhaps a way to ignore these commands by doing something outside of the class? I know try-except would probably work, but I'm trying to avoid making any edits to the class, if at all possible.
As far as I understood the question, you want to pass some mock objects to third-party scripts. Consider using mock library for python 2.x (which is available as for unittest.mock for python >= 3.3) to instantiate objects the objects that functions expect for.
NB But please, if the code in question is written by your colleagues do discuss with them some changes in their code that would simplify both your work and the clarity of the logic and will keep the code safe from mockups.
To ask my very specific question I find I need quite a long introduction to motivate and explain it -- I promise there's a proper question at the end!
While reading part of a large Python codebase, sometimes one comes across code where the interface required of an argument is not obvious from "nearby" code in the same module or package. As an example:
def make_factory(schema):
entity = schema.get_entity()
...
There might be many "schemas" and "factories" that the code deals with, and "def get_entity()" might be quite common too (or perhaps the function doesn't call any methods on schema, but just passes it to another function). So a quick grep isn't always helpful to find out more about what "schema" is (and the same goes for the return type). Though "duck typing" is a nice feature of Python, sometimes the uncertainty in a reader's mind about the interface of arguments passed in as the "schema" gets in the way of quickly understanding the code (and the same goes for uncertainty about typical concrete classes that implement the interface). Looking at the automated tests can help, but explicit documentation can be better because it's quicker to read. Any such documentation is best when it can itself be tested so that it doesn't get out of date.
Doctests are one possible approach to solving this problem, but that's not what this question is about.
Python 3 has a "parameter annotations" feature (part of the function annotations feature, defined in PEP 3107). The uses to which that feature might be put aren't defined by the language, but it can be used for this purpose. That might look like this:
def make_factory(schema: "xml_schema"):
...
Here, "xml_schema" identifies a Python interface that the argument passed to this function should support. Elsewhere there would be code that defines that interface in terms of attributes, methods & their argument signatures, etc. and code that allows introspection to verify whether particular objects provide an interface (perhaps implemented using something like zope.interface / zope.schema). Note that this doesn't necessarily mean that the interface gets checked every time an argument is passed, nor that static analysis is done. Rather, the motivation of defining the interface is to provide ways to write automated tests that verify that this documentation isn't out of date (they might be fairly generic tests so that you don't have to write a new test for each function that uses the parameters, or you might turn on run-time interface checking but only when you run your unit tests). You can go further and annotate the interface of the return value, which I won't illustrate.
So, the question:
I want to do exactly that, but using Python 2 instead of Python 3. Python 2 doesn't have the function annotations feature. What's the "closest thing" in Python 2? Clearly there is more than one way to do it, but I suspect there is one (relatively) obvious way to do it.
For extra points: name a library that implements the one obvious way.
Take a look at plac that uses annotations to define a command-line interface for a script. On Python 2.x it uses plac.annotations() decorator.
The closest thing is, I believe, an annotation library called PyAnno.
From the project webpage:
"The Pyanno annotations have two functions:
Provide a structured way to document Python code
Perform limited run-time checking "
I am attempting to re-organize our test libraries for automation and nose seems really promising. My question is, what is the best strategy for passing Python objects into nose tests?
Our tests are organized in a testlib with a bunch of modules that exercise different types of request operations. Something like this:
testlib
\-testmoda
\-testmodb
\-testmodc
In some cases the test modules (i.e. testmoda) is nothing but test_something(), test_something2() functions while in some cases we have a TestModB class in testmob with the test_anotherthing1(), test_anotherthing2() functions. The cool thing is that nose easily finds both.
Most of those test functions are request factory stuff that can easily share a single connection to our server farm. Thus we do a lot of test_something1(cnn), TestModB.test_anotherthing2(cnn), etc.
Currently we don't use nose, instead we have a hodge-podge of homegrown driver scripts with hard-coded lists of tests to execute. Each of those driver scripts creates its own connection object. Maintaining those scripts and the connection minutia is painful.
I'd like to take free advantage of nose's beautiful discovery functionality, passing in a connection object of my choosing.
Thanks in advance!
Rob
P.S. The connection objects are not pickle-able. :(
Could you use a factory create the connections, then have the functions test_something1() (taking no arguments) use the factory to get a connection?
As far as I can tell, there is no easy way to simply pass custom objects to Nose.
However, as Matt pointed out there are some viable workarounds to achieve similar results.
Basically, do this:
Setup a data dictionary as a package level global
Add custom objects to that dictionary
Create some factory functions to return those custom objects or create new ones if they're present/suitable
Refactor the existing testlib\testmod* modules to use the factory
In Java, this question is easy (if a little tedious) - every class requires its own file. So the number of .java files in a project is the number of classes (not counting anonymous/nested classes).
In Python, though, I can define multiple classes in the same file, and I'm not quite sure how to find the point at which I split things up. It seems wrong to make a file for every class, but it also feels wrong just to leave everything in the same file by default. How do I know where to break a program up?
Remember that in Python, a file is a module that you will most likely import in order to use the classes contained therein. Also remember one of the basic principles of software development "the unit of packaging is the unit of reuse", which basically means:
If classes are most likely used together, or if using one class leads to using another, they belong in a common package.
As I see it, this is really a question about reuse and abstraction. If you have a problem that you can solve in a very general way, so that the resulting code would be useful in many other programs, put it in its own module.
For example: a while ago I wrote a (bad) mpd client. I wanted to make configuration file and option parsing easy, so I created a class that combined ConfigParser and optparse functionality in a way I thought was sensible. It needed a couple of support classes, so I put them all together in a module. I never use the client, but I've reused the configuration module in other projects.
EDIT: Also, a more cynical answer just occurred to me: if you can only solve a problem in a really ugly way, hide the ugliness in a module. :)
In Java ... every class requires its own file.
On the flipside, sometimes a Java file, also, will include enums or subclasses or interfaces, within the main class because they are "closely related."
not counting anonymous/nested classes
Anonymous classes shouldn't be counted, but I think tasteful use of nested classes is a choice much like the one you're asking about Python.
(Occasionally a Java file will have two classes, not nested, which is allowed, but yuck don't do it.)
Python actually gives you the choice to package your code in the way you see fit.
The analogy between Python and Java is that a file i.e., the .py file in Python is
equivalent to a package in Java as in it can contain many related classes and functions.
For good examples, have a look in the Python built-in modules.
Just download the source and check them out, the rule of thumb I follow is
when you have very tightly coupled classes or functions you keep them in a single file
else you break them up.