How to mock pysvn - python

I am working on a Python module that is supposed to check out some code from SVN and build it. After much refactoring of some legacy code, I got fairly decent coverage of the code; however, I have a gaping hole in the code that uses pysvn.
Admittedly the concept of Mock object is new to me, but after reading some of the documentation of MiniMock and pymox (both are available in my environment), I came to the conclusion that I will need to capture some pysvn output and have it returned in my test code.
But here I find myself (pardon the pun) in a pickle. The objects returned from the pysvn.Client() commands do not behave nicely when I try to pickle them, or even to compare them.
Any suggestion of how to serialize or otherwise mock pysvn or some other non-pythonic behaving objects?
Naturally, I am willing to accept that I am approaching this problem from the wrong direction, or that I am simply an idiot. In that case any advice will be helpful.
Additional information 0:
Some pysvn objects can be reduced to a dict by accessing their data property, and can be reproduced by passing this dict into the appropriate __init__().
For example:
>>> svn=pysvn.Client()
>>> svn.list('http://svn/svn/')[0][0]
<PysvnList u'http://svn/svn'>
>>> d=svn.list('http://svn/svn/')[0][0].data
>>> pysvn.PysvnList(d)
<PysvnList u'http://svn/svn'>
However inside this object there might be some unpicklable objects:
>>> cPickle.dumps(d)
Traceback (most recent call last):
File "<stdin>", line 1, in ?
cPickle.UnpickleableError: Cannot pickle <type 'node_kind'> objects
Additional Information 1:
As per H. Dunlop's request, here is a (simplified) snippet of my code.
It allows getting a list out of SVN and lets the user choose an item from that list:
class Menu(object):
    """a well covered class"""
    # ...

class VersionControl(object):
    """A poorly covered class"""
    def __init__(self):
        self.svn = pysvn.Client()
        # ...

    def list(self, url):
        """svn ls $url"""
        return [os.path.basename(x['path']) for (x, _) in self.svn.list(url)[1:]]

    def choose(self, choice, url):
        """Displays a menu from the svn list, and gets the user's choice from it.
        Returns the svn item (path).
        """
        menu = Menu(prompt="Please choose %s from list:\n" % choice,
                    items=self.list(url),
                    multiple_choice=False)
        menu.present()
        return menu.chosen()

In this answer I used MiniMock; I'm not actually that familiar with it and would suggest using http://www.voidspace.org.uk/python/mock/ instead, as the code would end up a bit cleaner. But you specified MiniMock or pymox, so here goes:
from minimock import TraceTracker, Mock, mock
import unittest

import pysvn

from code_under_test import VersionControl


class TestVersionControl(unittest.TestCase):
    def test_init(self):
        mock_svn = Mock(name='svn_client')
        mock('pysvn.Client', returns=mock_svn)

        vc = VersionControl()

        self.assertEqual(vc.svn, mock_svn)

    def test_list_calls_svn_list_and_returns_urls(self):
        tracker = TraceTracker()
        test_url = 'a test_url'
        mock_data = [
            ({'path': 'first result excluded'}, None),
            ({'path': 'url2'}, None),
            ({'path': 'url3', 'info': 'not in result'}, None),
            ({'path': 'url4'}, None),
        ]
        vc = VersionControl()
        mock('vc.svn.list', returns=mock_data, tracker=tracker)

        response = vc.list(test_url)

        self.assertEqual(['url2', 'url3', 'url4'], response)
        self.assertTrue("Called vc.svn.list('a test_url')" in tracker.dump())


if __name__ == '__main__':
    unittest.main()
If you want to test more of the underlying dictionaries returned by pysvn, you can just modify the list of tuples of dictionaries that you make it return. You could even write a little bit of code that just dumps out the dictionaries from the pysvn objects, as sketched below.
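For example, a minimal sketch of such a fixture dumper, building on the .data dict shown earlier (dump_pysvn_fixture is a hypothetical helper name, and stringifying pysvn-specific values such as node_kind is a simplification):

import pysvn

def dump_pysvn_fixture(client, url):
    """Turn client.list() results into plain dicts that can be pasted into a test."""
    fixture = []
    for entry, extra in client.list(url):
        data = {}
        for key, value in entry.data.items():
            # Keep primitives as-is; stringify pysvn-specific types like node_kind.
            if isinstance(value, (str, int, float, bool, type(None))):
                data[key] = value
            else:
                data[key] = str(value)
        fixture.append((data, extra))
    return fixture

print(repr(dump_pysvn_fixture(pysvn.Client(), 'http://svn/svn/')))

Paste the printed structure into mock_data above, trimming any fields you don't care about.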

Have you considered using pickle instead of cPickle?
"In the cPickle module the callables Pickler() and Unpickler() are functions, not classes. This means that you cannot use them to derive custom pickling and unpickling subclasses."

Related

Is it possible to uniformly save any object in a JSON file?

I'm working on a web-server type of application and as part of multi-language communication I need to serialize objects in a JSON file. The issue is that I'd like to create a function which can take any custom object and save it at run time rather than limiting the function to what type of objects it can store based on structure.
Apologies if this question is a duplicate, however from what I have searched the other questions and answers do not seem to tackle the dynamic structure aspect of the problem, thus leading me to open this question.
The function is going to be used to communicate between PHP server code and Python scripts, hence the need for such a solution
I have attempted to use the json.dump(data, outfile) function; however, the issue is that I need to convert such objects to a legal data structure first.
JSON is a rigidly structured format, and Python's json module, by design, won't try to coerce types it doesn't understand.
Check out this SO answer. While __dict__ might work in some cases, it's often not exactly what you want. One option is to write one or more classes that inherit JSONEncoder and provide a method that turns your type or types into basic types that json.dump can understand.
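For instance, a minimal sketch of the JSONEncoder route (Point and MyEncoder are made-up names for illustration):

import json

class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y

class MyEncoder(json.JSONEncoder):
    def default(self, obj):
        # Turn types json doesn't understand into basic ones; fall back otherwise.
        if isinstance(obj, Point):
            return {"x": obj.x, "y": obj.y}
        return super().default(obj)

print(json.dumps(Point(1, 2), cls=MyEncoder))  # {"x": 1, "y": 2}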
Another option would be to write a parent class, e.g. JSONSerializable, and have these data types inherit it the way you'd use an interface in some other languages. Making it an abstract base class would make sense, but I doubt that's important to your situation. Define a method on your base class, e.g. def dictify(self), and either implement it if it makes sense to have a default behavior or just have it raise NotImplementedError.
Note that I'm not calling the method serialize, because actual serialization will be handled by json.dump.
from abc import ABC

class JSONSerializable(ABC):
    def dictify(self):
        raise NotImplementedError("Missing dictify implementation!")

class YourDataType(JSONSerializable):
    def __init__(self):
        self.something = None
        # etc etc

    def dictify(self):
        return {"something": self.something}

class YourIncompleteDataType(JSONSerializable):
    # No dictify(self) implementation
    pass
Example usage:
>>> valid = YourDataType()
>>> valid.something = "really something"
>>> valid.dictify()
{'something': 'really something'}
>>>
>>> invalid = YourIncompleteDataType()
>>> invalid.dictify()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 3, in dictify
NotImplementedError: Missing dictify implementation!
Basically, though: You do need to handle this yourself, possibly on a per-type basis, depending on how different your types are. It's just a matter of what method of formatting your types for serialization is the best for your use case.

Is it possible to add attributes to built in python objects dynamically in Python?

I need to add an attribute (holding a tuple or object) to Python objects dynamically. This works for Python classes written by me, but not for built-in classes.
Consider the following program:
import numpy as np

class My_Class():
    pass

my_obj = My_Class()
my_obj2 = My_Class()
my_obj.__my_hidden_field = (1, 1)
my_obj2.__my_hidden_field = (2, 1)
print(my_obj.__my_hidden_field, my_obj2.__my_hidden_field)
This correctly prints (1, 1) (2, 1). However, the following program doesn't work.
X = np.random.random(size=(2,3))
X.__my_hidden_field = (3,1)
setattr(X, '__my_hidden_field', (3,1))
Both of the above lines throw the following error: AttributeError: 'numpy.ndarray' object has no attribute '__my_hidden_field'
Now, the reason found in these questions (i.e., Attribute assignment to built-in object, Can't set attributes of object class, python: dynamically adding attributes to a built-in class) is that Python does not allow dynamically adding attributes to built-in objects.
Excerpt from the answer: https://stackoverflow.com/a/22103924/8413477
This is prohibited intentionally to prevent accidental fatal changes to built-in types (fatal to parts of the code that you never though of). Also, it is done to prevent the changes to affect different interpreters residing in the address space, since built-in types (unlike user-defined classes) are shared between all such interpreters.
However, all the answers are quite old, and I am badly in need of doing this for my research project.
There is a module that allows adding methods to built-in classes, though:
https://pypi.org/project/forbiddenfruit/
However, it doesn't allow adding attributes to individual objects.
Any help?
You probably want weakref.WeakKeyDictionary. From the doc,
This can be used to associate additional data with an object owned by other parts of an application without adding attributes to those objects.
Like an attribute, and unlike a plain dict, this allows the objects to get garbage collected when there are no other references to it.
You'd look up the field with
my_hidden_field[X]
instead of
X._my_hidden_field
Two caveats: First, since a weak key may be deleted at any time without warning, you shouldn't iterate over a WeakKeyDictionary. Looking up an object you have a reference to is fine though. And second, you can't make a weakref to an object type written in C that doesn't have a slot for it (true for many builtins), or a type written in Python that doesn't allow a __weakref__ attribute (usually due to __slots__).
If this is a problem, you can just use a normal dict for those types, but you'll have to clean it up yourself.
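A minimal sketch of the lookup pattern (note the key must be hashable as well as weak-referenceable, so this sketch uses a small made-up class rather than a numpy array):

import weakref

my_hidden_field = weakref.WeakKeyDictionary()

class Node:
    """Any hashable, weak-referenceable object can serve as a key."""
    pass

obj = Node()
my_hidden_field[obj] = (3, 1)   # associate data without touching obj itself
print(my_hidden_field[obj])     # (3, 1)

del obj                         # the entry disappears once obj is garbage collected
print(len(my_hidden_field))     # 0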
Quick answer
Is it possible to add attributes to built in python objects dynamically in Python?
No; the reasons you read about in the links you posted still hold today. But I came up with a recipe I think might be the starting point of your tracer.
Instrumenting using subclassing combined with AST
After reading a lot about this, I came out with a recipe that might not be the complete solution, but it sure looks like you can start from here.
The good thing about this recipe is that it doesn't use third-party libraries, all is achieved with the standard (Python 3.5, 3.6, 3.7) libraries.
The target code.
This recipe will make code like this be instrumented (simple instrumentation is performed here; this is just a proof of concept) and executed.
# target/target.py
d = {1: 2}
d.update({3: 4})
print(d) # Should print "{1: 2, 3: 4}"
print(d.hidden_field) # Should print "(0, 0)"
Subclassing
First we have to add the hidden_field to anything we want to (this recipe has been tested only with dictionaries).
The following code receives a value, finds out its type/class and subclasses it in order to add the mentioned hidden_field.
def instrument_node(value):
    VarType = type(value)

    class AnalyserHelper(VarType):
        def __init__(self, *args, **kwargs):
            self.hidden_field = (0, 0)
            super(AnalyserHelper, self).__init__(*args, **kwargs)

    return AnalyserHelper(value)
With that in place you are able to:
d = {1: 2}
d = instrument_node(d)
d.update({3: 4})
print(d) # Do print "{1: 2, 3: 4}"
print(d.hidden_field) # Do print "(0, 0)"
At this point, we already know a way to "add instrumentation to a built-in dictionary", but there is no transparency here.
Modify the AST.
The next step is to "hide" the instrument_node call and we will do that using the ast Python module.
The following is an AST node transformer that will take any dictionary it finds and wrap it in an instrument_node call:
class AnalyserNodeTransformer(ast.NodeTransformer):
    """Wraps all dicts in a call to instrument_node()."""

    def visit_Dict(self, node):
        return ast.Call(func=ast.Name(id='instrument_node', ctx=ast.Load()),
                        args=[node], keywords=[])
Putting it all together.
With those tools you can then write a script that:
Read the target code.
Parse the program.
Apply AST changes.
Compile it.
And execute it.
import ast
import os

from ast_transformer import AnalyserNodeTransformer
# instrument_node needs to be in the namespace here.
from ast_transformer import instrument_node

if __name__ == "__main__":
    target_path = os.path.join(os.path.dirname(__file__), 'target/target.py')
    with open(target_path, 'r') as program:
        # Read and parse the target script.
        tree = ast.parse(program.read())
        # Make transformations.
        tree = AnalyserNodeTransformer().visit(tree)
        # Fix locations.
        ast.fix_missing_locations(tree)
        # Compile and execute.
        compiled = compile(tree, filename='target.py', mode='exec')
        exec(compiled)
This will take our target code, wrap every dictionary in an instrument_node() call, and execute the result of that change.
The output of running this against our target code,
# target/target.py
d = {1: 2}
d.update({3: 4})
print(d) # Will print "{1: 2, 3: 4}"
print(d.hidden_field) # Will print "(0, 0)"
is:
>>> {1: 2, 3: 4}
>>> (0, 0)
Working example
You can clone a working example here.
Yes, it is possible; it is one of the coolest things about Python. In Python, all classes are created by the type metaclass.
You can read about it in detail here, but what you need to do is this:
In [58]: My_Class = type("My_Class", (My_Class,), {"__my_hidden_field__": X})
In [59]: My_Class.__my_hidden_field__
Out[59]:
array([[0.73998002, 0.68213825, 0.41621582],
[0.05936479, 0.14348496, 0.61119082]])
(Edited because inheritance was missing: you need to pass the original class as the second argument (in a tuple) so that it extends it; otherwise it simply re-writes the class.)

Mocking a file containing JSON data with mock.patch and mock_open

I'm trying to test a method that requires the use of json.load in Python 3.6.
After several attempts, I tried running the test "normally" (with the usual unittest.main() from the CLI) and in the IPython REPL.
Given the following function (simplified for the purpose of the example):
import json

def load_metadata(name):
    with open("{}.json".format(name)) as fh:
        return json.load(fh)
with the following test:
class test_loading_metadata(unittest2.TestCase):

    @patch('builtins.open', new_callable=mock_open(read_data='{"disabled":True}'))
    def test_load_metadata_with_disabled(self, filemock):
        result = load_metadata("john")
        self.assertEqual(result, {"disabled": True})
        filemock.assert_called_with("john.json")
The result of executing the test file is a heartbreaking:
TypeError: the JSON object must be str, bytes or bytearray, not 'MagicMock'
Executing the same thing on the command line, on the other hand, gives a successful result.
I tried it several ways (patching with a with statement, as a decorator), but the only thing I can think of is the unittest library itself and whatever it might be doing to interfere with mock and patch.
I also checked the Python versions in the virtualenv and IPython, and the version of the json library.
I would like to know why what looks like the same code, works in one place
and doesn't work in the other.
Or at least a pointer in the right direction to understand why this could be happening.
json.load() simply calls fh.read(), but fh is not a mock_open() object. It's a mock_open()() object, because new_callable is called before patching to create the replacement object:
>>> from unittest.mock import patch, mock_open
>>> with patch('builtins.open', new_callable=mock_open(read_data='{"disabled":True}')) as filemock:
... with open("john.json") as fh:
... print(fh.read())
...
<MagicMock name='open()().__enter__().read()' id='4420799600'>
Don't use new_callable; you don't want your mock_open() object to be called! Just pass it in as the new argument to @patch() (this is also the second positional argument, so you can leave off the new= here):
@patch('builtins.open', mock_open(read_data='{"disabled":True}'))
def test_load_metadata_with_disabled(self, filemock):
at which point you can call .read() on it when used as an open() function:
>>> with patch('builtins.open', mock_open(read_data='{"disabled":True}')) as filemock:
... with open("john.json") as fh:
... print(fh.read())
...
{"disabled":True}
The new argument is the object that'll replace the original when patching. If left to the default, new_callable() is used instead. You don't want new_callable() here.
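Putting it together, here is a sketch of the fixed test using the context-manager form, assuming the load_metadata function and unittest2 setup from the question; note the read_data is changed to lowercase true so it is valid JSON for json.load, and the mock is captured via as:

from unittest.mock import patch, mock_open
import unittest2

class test_loading_metadata(unittest2.TestCase):

    def test_load_metadata_with_disabled(self):
        # load_metadata as defined in the question.
        with patch('builtins.open', mock_open(read_data='{"disabled": true}')) as filemock:
            result = load_metadata("john")
        self.assertEqual(result, {"disabled": True})
        filemock.assert_called_with("john.json")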

Use Python for Creating JSON

I want to use Python for creating JSON.
Since I found no library which can help me, I want to know if it's possible to inspect the order of the classes in a Python file?
Example
# example.py
class Foo:
    pass

class Bar:
    pass
If I import example, I want to know the order of the classes. In this case it is [Foo, Bar] and not [Bar, Foo].
Is this possible? If "yes", how?
Background
I am not happy with yaml/json. I have the vague idea to create config via Python classes (only classes, not instantiation to objects).
Answers which help me to get to my goal (Create JSON with a tool which is easy and fun to use) are welcome.
The inspect module can tell the line numbers of the class declarations:
import inspect

def get_classes(module):
    for name, value in inspect.getmembers(module):
        if inspect.isclass(value):
            _, line = inspect.getsourcelines(value)
            yield line, name
So the following code:
import example

for line, name in sorted(get_classes(example)):
    print line, name
Prints:
2 Foo
5 Bar
First up, as I see it, there are 2 things you can do...
Continue trying to use Python source files as configuration files. (I won't recommend this; it's analogous to using a bulldozer to strike a nail or converting a shotgun to a wheel.)
Switch to something like TOML, JSON or YAML for configuration files, which are designed for the job.
Nothing in JSON or YAML prevents them from holding "ordered" key-value pairs. Python's dict data type is unordered by default (at least till 3.5) and list data type is ordered. These map directly to object and array in JSON respectively, when using the default loaders. Just use something like Python's OrderedDict when deserializing them and voila, you preserve order!
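For example, a minimal sketch of a round trip that keeps key order:

import json
from collections import OrderedDict

text = '{"first": 1, "second": 2, "third": 3}'
data = json.loads(text, object_pairs_hook=OrderedDict)  # keeps the key order from the file
print(list(data.keys()))   # ['first', 'second', 'third']
print(json.dumps(data))    # order is preserved when writing back out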
With that out of the way, if you really want to use Python source files for the configuration, I suggest trying to process the file using the ast module. Abstract Syntax Trees are a powerful tool for syntax level analysis.
I whipped up a quick script for extracting class line numbers and names from a file.
You (or anyone, really) can use it or extend it to be more extensive and have more checks, if you want.
import sys
import ast
import json


class ClassNodeVisitor(ast.NodeVisitor):
    def __init__(self):
        super(ClassNodeVisitor, self).__init__()
        self.class_defs = []

    def visit(self, node):
        super(ClassNodeVisitor, self).visit(node)
        return self.class_defs

    def visit_ClassDef(self, node):
        self.class_defs.append(node)


def read_file(fpath):
    with open(fpath) as f:
        return f.read()


def get_classes_from_text(text):
    try:
        tree = ast.parse(text)
    except Exception as e:
        raise e

    class_extractor = ClassNodeVisitor()
    li = []
    for definition in class_extractor.visit(tree):
        li.append([definition.lineno, definition.name])
    return li


def main():
    fpath = "/tmp/input_file.py"
    try:
        text = read_file(fpath)
    except Exception as e:
        print("Could not load file due to " + repr(e))
        return 1
    print(json.dumps(get_classes_from_text(text), indent=4))


if __name__ == '__main__':
    sys.exit(main())
Here's a sample run on the following file:
input_file.py:
class Foo:
    pass

class Bar:
    pass
Output:
$ py_to_json.py input_file.py
[
    [
        1,
        "Foo"
    ],
    [
        5,
        "Bar"
    ]
]
If I import example,
If you're going to import the module, the example module needs to be on the import path. Importing means executing any Python code in the example module. This is a pretty big security hole - you're loading a user-editable file in the same context as the rest of the application.
I'm assuming that since you care about preserving class-definition order, you also care about preserving the order of definitions within each class.
It is worth pointing out that this is now the default behavior in Python, since Python 3.6.
Also see PEP 520: Preserving Class Attribute Definition Order.
(Moving my comments to an answer)
That's a great vague idea. You should give Figura a shot! It does exactly that.
(Full disclosure: I'm the author of Figura.)
I should point out the order of declarations is not preserved in Figura, and also not in json.
I'm not sure about order-preservation in YAML, but I did find this on wikipedia:
... according to the specification, mapping keys do not have an order
It might be the case that specific YAML parsers maintain the order, though they aren't required to.
You can use a metaclass to record each class's creation time, and later, sort the classes by it.
This works in python2:
class CreationTimeMetaClass(type):
    creation_index = 0

    def __new__(cls, clsname, bases, dct):
        dct['__creation_index__'] = cls.creation_index
        cls.creation_index += 1
        return type.__new__(cls, clsname, bases, dct)

__metaclass__ = CreationTimeMetaClass

class Foo: pass
class Bar: pass

classes = [cls for cls in globals().values() if hasattr(cls, '__creation_index__')]
print(sorted(classes, key=lambda cls: cls.__creation_index__))
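In Python 3 the module-level __metaclass__ hook is gone, so you would pass the metaclass explicitly instead; a small sketch of the same idea, reusing the CreationTimeMetaClass defined above:

class Foo(metaclass=CreationTimeMetaClass): pass
class Bar(metaclass=CreationTimeMetaClass): pass

classes = [cls for cls in globals().values() if hasattr(cls, '__creation_index__')]
print(sorted(classes, key=lambda cls: cls.__creation_index__))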
The standard json module is easy to use and works well for reading and writing JSON config files.
Objects are not ordered within JSON structures but lists/arrays are, so put order dependent information into a list.
I have used classes as a configuration tool: the thing I did was to derive them from a base class which was customised by the particular class variables. By using classes like this I did not need a factory class. For example:
from .artifact import Application
class TempLogger(Application): partno='03459'; path='c:/apps/templog.exe'; flag=True
class GUIDisplay(Application): partno='03821'; path='c:/apps/displayer.exe'; flag=False
in the installation script
from .install import Installer
import app_configs
installer = Installer(apps=(TempLogger(), GUIDisplay()))
installer.baseline('1.4.3.3475')
print installer.versions()
print installer.bill_of_materials()
One should use the right tools for the job, so perhaps python classes are not the right tool if you need ordering.
Another python tool I have used to create JSON files is Mako templating system. This is very powerful. We used it to populate variables like IP addresses etc into static JSON files that were then read by C++ programs.
I'm not sure if this answers your question, but it might be relevant. Take a look at the excellent attrs module. It's great for creating classes to use as data types.
Here's an example from glyph's blog (creator of Twisted Python):
import attr

@attr.s
class Point3D(object):
    x = attr.ib()
    y = attr.ib()
    z = attr.ib()
It saves you writing a lot of boilerplate code - you get things like str representation and comparison for free, and the module has a convenient asdict function which you can pass to the json library:
>>> p = Point3D(1, 2, 3)
>>> str(p)
'Point3D(x=1, y=2, z=3)'
>>> p == Point3D(1, 2, 3)
True
>>> json.dumps(attr.asdict(p))
'{"y": 2, "x": 1, "z": 3}'
The module uses a strange naming convention, but read attr.s as "attrs" and attr.ib as "attrib" and you'll be okay.
Just touching on the point about creating JSON from Python: there is an excellent library called jsonpickle which lets you dump Python objects to JSON. (Using this alone or with the other methods mentioned here, you can probably get what you wanted.)
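A minimal sketch of the round trip (assuming jsonpickle is installed; Config is a made-up class for illustration):

import jsonpickle

class Config:
    def __init__(self):
        self.name = "demo"
        self.values = [1, 2, 3]

frozen = jsonpickle.encode(Config())   # a JSON string that also records the type
thawed = jsonpickle.decode(frozen)     # back to a Config instance
print(thawed.name, thawed.values)      # demo [1, 2, 3]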

Easy Python ASync. Precompiler?

Imagine you have an IO-heavy function like this:
from hashlib import md5

def getMd5Sum(path):
    with open(path) as f:
        return md5(f.read()).hexdigest()
Do you think Python is flexible enough to allow code like this (notice the $):
def someGuiCallback(filebutton):
    ...
    path = filebutton.getPath()
    md5sum = $getMd5Sum()
    showNotification("Md5Sum of file: %s" % md5sum)
    ...
To be executed something like this:
def someGuiCallback_1(filebutton):
    ...
    path = filebutton.getPath()
    Thread(target=someGuiCallback_2, args=(path,)).start()

def someGuiCallback_2(path):
    md5sum = getMd5Sum(path)
    glib.idle_add(someGuiCallback_3, md5sum)

def someGuiCallback_3(md5sum):
    showNotification("Md5Sum of file: %s" % md5sum)
    ...
(glib.idle_add just pushes a function onto the queue of the main thread)
I've thought about using decorators, but they don't allow me to access the 'content' of the function after the call (the showNotification part).
I guess I could write a 'compiler' to change the code before execution, but it doesn't seem like the optimal solution.
Do you have any ideas, on how to do something like the above?
You can use import hooks to achieve this goal...
PEP 302 - New Import Hooks
PEP 369 - Post Import Hooks
... but I'd personally view it as a little bit nasty.
If you want to go down that route though, essentially what you'd be doing is this:
You add an import hook for an extension (eg ".thpy")
That import hook is then responsible for (essentially) passing some valid code as a result of the import.
That valid code is given arguments that effectively relate to the file you're importing.
That means your precompiler can perform whatever transformations you like to the source on the way in.
On the downside:
Whilst using import hooks in this way will work, it will surprise the life out of any maintainer of your code. (Bad idea IMO)
The way you do this relies upon imputil - which has been removed in python 3.0, which means your code written this way has a limited lifetime.
Personally I wouldn't go there, but if you do, there's an issue of the Python Magazine where doing this sort of thing is covered in some detail, and I'd advise getting a back issue of that to read up on it. (Written by Paul McGuire, April 2009 issue, probably available as PDF).
Specifically, that uses imputil and pyparsing as its example, but the principle is the same.
How about something like this:
def performAsync(asyncFunc, notifyFunc):
    def threadProc():
        retValue = asyncFunc()
        glib.idle_add(notifyFunc, retValue)
    Thread(target=threadProc).start()

def someGuiCallback(filebutton):
    path = filebutton.getPath()
    performAsync(
        lambda: getMd5Sum(path),
        lambda md5sum: showNotification("Md5Sum of file: %s" % md5sum)
    )
A bit ugly with the lambdas, but it's simple and probably more readable than using precompiler tricks.
Sure, you can access function code (already compiled) from a decorator, disassemble it and hack it. You can even access the source of the module it's defined in and recompile it. But I think this is not necessary. Below is an example using a decorated generator, where the yield statement serves as a delimiter between the synchronous and asynchronous parts:
from threading import Thread
import hashlib

def async(gen):
    def func(*args, **kwargs):
        it = gen(*args, **kwargs)
        result = it.next()
        Thread(target=lambda: list(it)).start()
        return result
    return func

@async
def test(text):
    # synchronous part (empty in this example)
    yield  # Use "yield value" if you need to return a meaningful value
    # asynchronous part[s]
    digest = hashlib.md5(text).hexdigest()
    print digest
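Usage is then just a normal call (a sketch; the synchronous part runs inline and the digest is printed from the worker thread):

test("hello world")  # returns immediately; the md5 digest is printed asynchronously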
