Framework for regression testing of numerical code - python

My python code is performing fairly complex numerical calculations, and in many cases I am unable to provide known solutions to enable unit testing (especially for intermediate results).
However, I have found that I can catch a lot of bugs with nose, by performing regression testing using the following workflow:
1. Write test code to solve some relatively small problem.
2. Run once, inspect the results (often in the form of a matplotlib plot), and decide by comparison with analytical results, other numerical software, or physical intuition that the results are correct to within acceptable numerical accuracy.
3. Save the resulting numpy arrays to text files to act as a reference (FWIW I was avoiding numpy's saving routines as a workaround for this bug, but as this has been fixed in a released version I think I can use them now).
4. The test code performs the calculation and compares it with the reference data read in from the file using numpy's assert_allclose.
5. The test function is written so that by default it performs the test, but by passing non-default values for arguments I can plot the results and overwrite the reference file if it becomes necessary. The reference file is checked into git, so there is little risk of accidentally overwriting the test values without noticing.
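For concreteness, the workflow above can be sketched roughly as follows. This is only a minimal illustration, not an existing framework: the helper name check_against_reference, the update flag, and the stand-in calculation solve_small_problem are all hypothetical names chosen for this example. It assumes the results are 1-D or 2-D float arrays, since np.savetxt cannot write higher-dimensional arrays.

```python
import numpy as np
from numpy.testing import assert_allclose


def check_against_reference(result, ref_path, rtol=1e-7, atol=0.0, update=False):
    """Compare `result` with the reference array stored at `ref_path`.

    Hypothetical helper: with update=True the reference file is
    (over)written instead of being tested against.
    """
    result = np.asarray(result, dtype=float)
    if update:
        # The default savetxt format ('%.18e') keeps enough digits
        # for float64 values to be read back unchanged.
        np.savetxt(ref_path, result)
        return
    reference = np.loadtxt(ref_path)
    assert_allclose(result, reference, rtol=rtol, atol=atol)


def solve_small_problem(n=50):
    """Stand-in for the real numerical calculation (an assumption)."""
    x = np.linspace(0.0, 1.0, n)
    return np.sin(2.0 * np.pi * x)


def small_problem_regression(ref_path, update=False):
    """Run the calculation and test it (or refresh the reference)."""
    check_against_reference(solve_small_problem(), ref_path, update=update)
```

A test function collected by the test runner would then be a one-line call to small_problem_regression with the path of the reference file in the repository, and the same call with update=True regenerates that file after the results have been inspected by hand.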
However, I find myself writing a lot of boilerplate code to implement the above functionality, which outweighs the actual test code itself. Cleaning this up would make it much easier to increase test coverage.
Is there some python testing framework or plugin for nose that could easily automate the above workflow?

A few months ago I wrote the nrtest utility in an attempt to make this workflow easier. It sounds like it might help you too.
Here's a quick overview. Each test is defined by its input files and its expected output files. Following execution, output files are stored in a portable benchmark directory. A second step then compares this benchmark to a reference benchmark. A recent update has enabled user extensions, so you can define comparison functions for your custom data.
I hope it helps.

Related

Is there a GPU/parallelized python implementation of k-NN which does not rebuild the KDTree upon the addition or subtraction of one point?

I have a dynamic point cloud in 3D, and I would like to use nanoflann to dynamically add/subtract points in between queries, without rebuilding the tree (as seen here):
https://github.com/jlblancoc/nanoflann/blob/master/examples/dynamic_pointcloud_example.cpp
I have found a python wrapper for nanoflann as well (awesome!):
https://github.com/u1234x1234/pynanoflann
This is great! However, speed is extremely important for my application, so I will almost certainly need to parallelize this k-NN implementation. Is there an existing python implementation or wrapper for such a dynamic k-NN implementation written with OpenCL or CUDA? I wanted to check if one exists before writing my own. By "dynamic" I do not mean being able to specify a new k with each query. I don't mind if it is a single KDTree with a set k. I simply need to be able to remove or add points in between queries without rebuilding the KDTree.
Thank you in advance!

How fast is the R sparklyr package as a front end for Spark?

I've read online that Scala is faster than Python, e.g. here. I've also seen a comparison between different front ends that concluded that R was so slow that the tester gave up trying to measure its performance (here; although this was specifically a test for user-defined functions and may not have used the sparklyr package).
I also know that sparklyr now has arrow integration, which has led to performance improvements for user defined functions as well as copying data to/from the cluster, as shown here.
My question: how fast is sparklyr compared to Python/Scala? I'm mostly interested in standard 'out-of-the-box' functions, but would also be interested to know how it stacks up for user-defined functions now that arrow has been integrated. And are there particular circumstances in which it performs well or badly?
I ask because I've built an app in sparklyr that is slower than I would have hoped despite lots of tinkering with tuning parameters, and I'm wondering if this is partly because of limitations in the package.
On a local PC, R is much faster than Spark most of the time.

Auto-memoizing Python interpreter for machine learning (IncPy alternative?)

I am working on a machine learning project in python, and I often find myself rerunning some algorithm with different tweaks each time (changing a few parameters, different normalization, some extra feature engineering, etc.). Each time, most of the computation is the same except for a few steps. I can, of course, save some intermediate states on disk and load them next time instead of computing the same thing over and over again.
The thing is that there are so many such intermediate results that manually saving them and keeping a record of them would be a pain. I looked at some python decorator here that can make things a bit easier. However, the problem with this implementation is that it will always return the result from the first time you called the function, even when your function has arguments and therefore should produce different results for different arguments. I really need to memoize the output of a function for different arguments.
I googled extensively on this topic, and the closest thing that I found is IncPy by Philip Guo. IncPy (Incremental Python) is an enhanced Python interpreter that speeds up script execution times by automatically memoizing (caching) the results of long-running function calls and then re-using those results rather than re-computing, when safe to do so.
I really like the idea and think it will be very useful for data science and machine learning, but the code was written nine years ago for python 2.6 and is no longer maintained.
So my question is: are there any alternative automatic caching/memoizing techniques in python that can handle relatively large datasets?
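To make the requirement concrete, here is a minimal sketch of argument-aware memoization that stores results on disk, using only the standard library. The decorator name disk_memoize and the cache layout are assumptions made for this illustration. It keys each result on a hash of the pickled arguments, so it assumes the arguments and return values are picklable and that equal arguments pickle to the same bytes. It does not notice when the function body changes, which is the kind of safety check IncPy is described as handling.

```python
import functools
import hashlib
import os
import pickle


def disk_memoize(cache_dir):
    """Cache a function's return value on disk, keyed by its arguments.

    Hypothetical helper for illustration only.
    """
    os.makedirs(cache_dir, exist_ok=True)

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Build the cache key from the function identity and arguments.
            key_bytes = pickle.dumps(
                (func.__module__, func.__qualname__, args, sorted(kwargs.items()))
            )
            key = hashlib.sha256(key_bytes).hexdigest()
            path = os.path.join(cache_dir, key + ".pkl")

            # Cache hit: load the stored result instead of recomputing.
            if os.path.exists(path):
                with open(path, "rb") as fh:
                    return pickle.load(fh)

            # Cache miss: compute, store, and return.
            result = func(*args, **kwargs)
            with open(path, "wb") as fh:
                pickle.dump(result, fh)
            return result

        return wrapper

    return decorator
```

Decorating an expensive step with @disk_memoize("some_cache_dir") then recomputes only for argument combinations that have not been seen before. The Memory class in the joblib library provides a more complete implementation of the same idea.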

Scipy minimize function seems to be creating multiple threads by itself?

I am using the scipy minimize function. The function that it's calling was compiled with Cython and has an underlying C++ implementation that I wrote, but that shouldn't really matter. For some reason, when I run my program, it creates as many threads as it can to fill all my CPUs. For example, if I run top I see that 800% of a CPU is being used, or in htop I can see that 8 individual processors are being used, when I only wrote the program to run on one. I didn't think that scipy even had parallel processing functionality, and I can't find any documentation related to this. What could possibly be going on, and is there any way to control it?
If a BLAS implementation with threading support is available (the default on Ubuntu, for example), some expressions like np.dot() (only the dense case, as far as I know) will automatically be run in parallel (reference). Another possible example is sparse-matrix factorization with SuperLU.
Of course, different minimizers will behave differently.
Newton-type methods (core: solve a system of sparse linear equations) are probably based on SuperLU (if the code is not one of the common old Fortran/C ones, where the whole code is self-contained). CG-type methods are heavily based on matrix-vector products (np.dot; so the dense case will be parallel).
For some control over this, start with this SO question.
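One common way to get that control is to cap the thread count through environment variables before numpy or scipy is imported. The sketch below assumes the threads come from an OpenMP-, OpenBLAS-, or MKL-based BLAS; which of these variables actually applies depends on how numpy was built on your machine. The helper name limit_blas_threads is hypothetical.

```python
import os

# Environment variables read by common threaded BLAS/OpenMP runtimes.
# Which one matters depends on the BLAS your numpy build links against.
THREAD_ENV_VARS = (
    "OMP_NUM_THREADS",       # OpenMP runtime
    "OPENBLAS_NUM_THREADS",  # OpenBLAS
    "MKL_NUM_THREADS",       # Intel MKL
)


def limit_blas_threads(n_threads=1):
    """Set the thread-count variables (hypothetical helper).

    These are typically read when the library is first loaded,
    so call this before importing numpy or scipy.
    """
    for name in THREAD_ENV_VARS:
        os.environ[name] = str(n_threads)


limit_blas_threads(1)
# Import numpy / scipy only after this point.
```

Setting the same variables in the shell before launching the script has the same effect and avoids any dependence on import order.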

Automatic CudaMat conversion in Python

I'm looking into speeding up my python code, which is all matrix math, using some form of CUDA. Currently my code is using Python and Numpy, so it seems like it shouldn't be too difficult to rewrite it using something like either PyCUDA or CudaMat.
However, on my first attempt using CudaMat, I realized I had to rearrange a lot of the equations in order to keep the operations all on the GPU. This included the creation of many temporary variables so I could store the results of the operations.
I understand why this is necessary, but it turns what were once easy-to-read equations into somewhat of a mess that is difficult to inspect for correctness. Additionally, I would like to be able to easily modify the equations later on, which isn't easy in their converted form.
The package Theano manages to do this by first creating a symbolic representation of the operations, then compiling them to CUDA. However, after trying Theano out for a bit, I was frustrated by how opaque everything was. For example, just getting the actual value for myvar.shape[0] is made difficult since the tree doesn't get evaluated until much later. I would also much prefer less of a framework to which my code must conform, and more of a library that acts invisibly in place of Numpy.
Thus, what I would really like is something much simpler. I don't want automatic differentiation (there are other packages like OpenOpt that can do that if I require it), or optimization of the tree, but just a conversion from standard Numpy notation to CudaMat/PyCUDA/somethingCUDA. In fact, I want to be able to have it evaluate to just Numpy without any CUDA code for testing.
I'm currently considering writing this myself, but before I even consider such a venture, I wanted to see if anyone else knows of similar projects or a good starting place. The only other project I know that might be close to this is SymPy, but I don't know how easy it would be to adapt to this purpose.
My current idea would be to create an array class that looks like the Numpy array class. Its only function would be to build a tree. At any time, that symbolic array class could be converted to a Numpy array class and be evaluated (there would also be a one-to-one parity). Alternatively, the array class could be traversed and have CudaMat commands be generated. If optimizations are required, they can be done at that stage (e.g. re-ordering of operations, creation of temporary variables, etc.) without getting in the way of inspecting what's going on.
Any thoughts/comments/etc. on this would be greatly appreciated!
Update
A use case might look something like the following (where sym is the hypothetical module), for example when calculating a gradient:
W = sym.array(np.random.random(size=(numVisible, numHidden)))
delta_o = -(x - z)
delta_h = sym.dot(delta_o, W)*h*(1.0-h)
grad_W = sym.dot(x.T, delta_h)
In this case, grad_W would actually just be a tree containing the operations that needed to be done. If you wanted to evaluate the expression normally (i.e. via Numpy) you could do:
npGrad_W = grad_W.asNumpy()
which would just execute the Numpy commands that the tree represents. If on the other hand, you wanted to use CUDA, you would do:
cudaGrad_W = grad_W.asCUDA()
which would convert the tree into expressions that can be executed via CUDA (this could happen in a couple of different ways).
That way it should be trivial to: (1) test grad_W.asNumpy() == grad_W.asCUDA(), and (2) convert your pre-existing code to use CUDA.
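A rough sketch of the Numpy half of this idea is below. It is only an illustration of the tree-building approach, not an existing package: the Expr class, the array and dot functions, and the operation table are all made-up names, and only the handful of operations used in the gradient example are covered. An asCUDA() method would walk the same tree and emit GPU calls instead; that part is not attempted here.

```python
import numpy as np


class Expr:
    """Node in a lazily evaluated expression tree (hypothetical)."""

    # Tell numpy to defer to our reflected operators (e.g. ndarray * Expr).
    __array_ufunc__ = None

    def __init__(self, op, args):
        self.op = op      # operation name, or "leaf" for a wrapped array
        self.args = args  # child nodes, or the ndarray itself for a leaf

    # Operators only record what should happen; nothing is computed yet.
    def __add__(self, other): return Expr("add", (self, wrap(other)))
    def __radd__(self, other): return Expr("add", (wrap(other), self))
    def __sub__(self, other): return Expr("sub", (self, wrap(other)))
    def __rsub__(self, other): return Expr("sub", (wrap(other), self))
    def __mul__(self, other): return Expr("mul", (self, wrap(other)))
    def __rmul__(self, other): return Expr("mul", (wrap(other), self))
    def __neg__(self): return Expr("neg", (self,))

    @property
    def T(self):
        return Expr("transpose", (self,))

    def asNumpy(self):
        """Evaluate the tree recursively with plain Numpy."""
        if self.op == "leaf":
            return self.args
        values = [child.asNumpy() for child in self.args]
        return NUMPY_OPS[self.op](*values)


# Mapping from recorded operation names to the Numpy functions that run them.
NUMPY_OPS = {
    "add": np.add, "sub": np.subtract, "mul": np.multiply,
    "neg": np.negative, "dot": np.dot, "transpose": np.transpose,
}


def wrap(value):
    """Turn scalars and arrays into leaf nodes; pass Expr nodes through."""
    return value if isinstance(value, Expr) else Expr("leaf", np.asarray(value))


def array(value):
    return wrap(value)


def dot(a, b):
    return Expr("dot", (wrap(a), wrap(b)))


# The gradient example, with small random inputs as placeholders.
num_visible, num_hidden, n_samples = 4, 3, 5
rng = np.random.default_rng(0)
W_np = rng.random((num_visible, num_hidden))
x_np = rng.random((n_samples, num_visible))
z_np = rng.random((n_samples, num_visible))
h_np = rng.random((n_samples, num_hidden))

W, x, z, h = (array(a) for a in (W_np, x_np, z_np, h_np))
delta_o = -(x - z)
delta_h = dot(delta_o, W) * h * (1.0 - h)
grad_W = dot(x.T, delta_h)     # still just a tree at this point
npGrad_W = grad_W.asNumpy()    # evaluated here
```

Because the tree is plain Python data, it can be printed or inspected before evaluation, which addresses the opacity complaint, and a second backend only needs its own version of the operation table.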
Have you looked at the GPUArray portion of PyCUDA?
http://documen.tician.de/pycuda/array.html
While I haven't used it myself, it seems like it would be what you're looking for. In particular, check out the "Single-pass Custom Expression Evaluation" section near the bottom of that page.
