I have some machine learning code (but for the purposes of my question, it could be any algorithm) written in Python under git version control. My colleague has just concluded a significant refactor of the code base on a feature branch. Now I want to write some acceptance tests to make sure that the results of querying the machine learning process in production are within a reasonable statistical range compared to the same query on the refactored branch of the code base.
This is a somewhat complex app running in docker containers. The tests need to be run as part of a much larger series of tests.
I am not confused about how to write the part of the test that determines whether the results are within a reasonable statistical range. I am confused about how to bring the results from the master branch code together with the results from the WIP branch for comparison in one test, in an automated way.
So far, my best (only) idea is to launch the production docker container, run the machine learning query, write the query result to a CSV file and then use docker cp to copy this file to the local machine, where it can be tested against the equivalent CSV from the feature branch. However, this seems inelegant.
Is there a better/smarter/best practice way to do this? Ideally that keeps things in memory.
I would consider using the http://approvaltests.com/ framework. You just need to write test code that produces some output after executing the Python code being tested; the output can be anything (text, a JSON/CSV file, etc.).
You can run this test on the master branch first so that it records the output as the approved baseline, then switch to your WIP branch and run the same test there; if the output differs, the approval test will fail.
Check out this podcast episode for more info.
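A minimal sketch of that workflow, assuming the approvaltests Python package (pip install approvaltests) and a hypothetical run_ml_query() helper that executes the query against your code base:

# Sketch only: run_ml_query() is a placeholder for your own query code.
from approvaltests.approvals import verify

def test_ml_query_output():
    result = run_ml_query()  # hypothetical: run the ML query on the current branch
    # On the first (master) run this records the output as the approved file;
    # on later runs (e.g. on the WIP branch) the test fails if the output differs.
    verify(str(result))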
I know the question title is weird!
I have two virtual machines. The first one has limited resources, while the second one has plenty, just like a normal machine. The first machine will receive a signal from an external device. This signal will trigger the Python interpreter to execute a script. The script is big and the first machine does not have enough resources to execute it.
I can copy the script to the second machine and run it there, but I can't make the second machine receive the external signal. I am wondering if there is a way to make the interpreter on the first machine (once the external signal is received) call the interpreter on the second machine, so that the script executes on the second machine and uses that machine's resources.
Assume that the connection between the two machines is established and they can see each other, and that the second machine has a copy of the script. I just need the commands that pass the execution to the second machine and make it use its own resources.
You should look into a microservice architecture to do this.
You can achieve this either by using Flask and sending HTTP requests between the machines, or with something like nameko, which will allow you to create a "bridge" between the machines and call functions across them (this seems to be what you are more interested in). Example for nameko:
Machine 2 (executor of resource-intensive script):
from nameko.rpc import rpc

class Stuff(object):

    @rpc
    def example(self):
        return "Function running on Machine 2."
You would run the above script through the Nameko shell, as detailed in the docs.
Machine 1:
from nameko.standalone.rpc import ClusterRpcProxy

# This is the AMQP server that machine 2 would be running,
# e.g. "pyamqp://guest:guest@localhost"
config = {
    'AMQP_URI': AMQP_URI
}

with ClusterRpcProxy(config) as cluster_rpc:
    cluster_rpc.Stuff.example()  # "Function running on Machine 2."
More info here.
Hmm, there are many approaches to this problem.
If you want a Python-only solution, you can check out
dispy (http://dispy.sourceforge.net/)
or Dask (https://dask.org/).
If you want a robust solution (what I use on my home computing cluster but imo overkill for your problem) you can use
SLURM. SLURM is basically a way to string multiple computers together into a "supercomputer". https://slurm.schedmd.com/documentation.html
For a semi-quick, hacky solution, you can write a microservice. Essentially, your "weak" computer will receive the message and then send an HTTP request to your "strong" computer. Your strong computer will contain the actual program, compute the results, and pass the result back to your "weak" computer.
Flask is an easy and lightweight solution for this.
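A rough sketch of that microservice idea (the endpoint name and the run_big_script() helper are illustrative, not from the original question). The "strong" machine exposes the heavy script behind a Flask endpoint:

from flask import Flask, jsonify

app = Flask(__name__)

@app.route('/run', methods=['POST'])
def run_heavy_script():
    result = run_big_script()  # hypothetical: the resource-intensive work
    return jsonify({'result': result})

# The "weak" machine, once the external signal arrives, just forwards the work:
#
#   import requests
#   response = requests.post('http://strong-machine:5000/run', timeout=600)
#   print(response.json()['result'])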
All of these solutions require some type of networking. At the least, the computers need to be on the same LAN or both have access over the web.
There are many other approaches not mentioned. For example, you can export an NFS (network file system) share and have one computer put a file in the shared folder while the other computer performs work on the file. I'm sure there are plenty of other contrived ways to accomplish this task :). I'd be happy to expand on a particular method if you want.
I'm working on setting up my team's new unit test and integration test infrastructure and want to make sure I'm starting off by selecting the correct test frameworks. I'm an embedded developer testing code running on a VxWorks operating system with a C/C++ production codebase.
We need a framework capable of directly testing C/C++ for unit testing, so for our unit tests I chose Googletest as our framework.
However, for integration tests we've generally tested using Python scripts (with no test framework). The Python scripts connect to the embedded system over a network and run test cases by sending commands and receiving telemetry.
Would using pytest as a test framework be beneficial to the way we're currently using Python for integration testing an embedded system? Most of the examples I've seen use pytest in a more unit test fashion by creating assertions for single functions in a Python production codebase.
EDIT:
Per hoefling's comment, I'll provide a (very simplified) example of one of our existing Python integration test cases, and also what I believe its corresponding pytest implementation would be.
# Current example
def test_command_counter():
    preTestCmdCount = getCmdCountFromSystem()
    sendCommandToSystem()
    postTestCmdCount = getCmdCountFromSystem()
    if postTestCmdCount != (preTestCmdCount + 1):
        print("FAIL: Command count did not increment!")
    else:
        print("PASS")
# Using pytest?
def test_command_counter():
    preTestCmdCount = getCmdCountFromSystem()
    sendCommandToSystem()
    postTestCmdCount = getCmdCountFromSystem()
    assert postTestCmdCount == (preTestCmdCount + 1)
So, correct me if I'm wrong, but it appears that the advantages of using Pytest over plain Python for this simplified case would be:
Being able to make use of pytest's automated test case discovery, so that I can easily run all of my test functions instead of having to create custom code to do so.
Being able to make use of the plain 'assert' syntax, with pytest automatically reporting a pass/fail (and a useful failure message) for each test, instead of having to manually implement pass/fail print statements for each test case.
I've been in a similar situation, and from what I gathered unit testing frameworks are NOT appropriate for integration testing on embedded systems. A similar question was asked here:
Test framework for testing embedded systems in Python
We personally use Google's OpenHTF to automate both the integration testing (as in your case) and the production verification testing, which includes bringup, calibration and overall verification of assembly.
Check it out: https://github.com/google/openhtf
We automate advanced test equipment such as RF switches and Spectrum Analysers all in Python in order to test and calibrate our RF units, which operate in the >500 MHz range.
Using OpenHTF, you can create complete tests with measurements very quickly. It provides a built-in web GUI and allows you to write custom export 'callbacks'.
We're currently building a complete test solution for hardware testing. I'd be glad to help with OpenHTF if needed, as we're basing our flagship implementation on it.
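As a minimal sketch of what an OpenHTF phase could look like for a check such as the command-counter example earlier in this thread (the phase and measurement names are illustrative, and getCmdCountFromSystem/sendCommandToSystem are the asker's own helpers):

import openhtf as htf

# Declare a measurement with a validator: the delta must be exactly 1.
@htf.measures(htf.Measurement('command_count_delta').in_range(1, 1))
def command_counter_phase(test):
    before = getCmdCountFromSystem()
    sendCommandToSystem()
    test.measurements.command_count_delta = getCmdCountFromSystem() - before

if __name__ == '__main__':
    # Run the test; the lambda supplies a DUT identifier for the test record.
    htf.Test(command_counter_phase).execute(test_start=lambda: 'unit-001')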
This thread's old, but these suggestions might help someone...
For unit testing embedded C on a host PC and on target we use Unity and CMock. http://www.throwtheswitch.org/
For hardware-in-the-loop testing we use pytest.
Both work well and are part of our Jenkins release workflow.
I have one server which runs multiple containers:
Nginx
Portainer
Several custom HTTP servers
RabbitMQ
I have a folder structure like this in the home directory:
/docker/dockerfiles/nginx/Dockerfile
/docker/dockerfiles/nginx/README
/docker/dockerfiles/nginx/NOTES
/docker/dockerfiles/portainer/Dockerfile
...
/docker/dockerfiles/rabbitmq/Dockerfile
/docker/volumes/nginx/sites/...
/docker/volumes/nginx/logs/...
/docker/volumes/portainer/
...
/docker/volumes/rabbitmq/
/docker/volumes/ contains all the files which the docker containers use; they are mapped into the containers. The containers don't use real Docker volumes, and I really want to avoid using them.
I also have 3 Python files:
containers_info.py
containers_build.py
containers_run.py
containers_info.py is basically a dictionary holding rudimentary information about the containers, such as each container's version and build date, whether it should be included or excluded in a build pass, and whether it should be included or excluded in a run pass.
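A hypothetical sketch of the kind of dictionary containers_info.py might hold, based on that description (all keys and values here are illustrative only):

containers = {
    'nginx': {
        'version': '1.2.0',
        'build-date': '2019-05-01',
        'build': True,            # include in a build pass
        'auto-run': True,         # include in a run pass
        'auto-delete-old': True,  # remove the previous '-old' container
    },
    'rabbitmq': {
        'version': '3.7.0',
        'build-date': '2019-04-20',
        'build': False,
        'auto-run': False,
        'auto-delete-old': False,
    },
}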
containers_build.py imports containers_info.py and checks which containers should be built, reads the corresponding Dockerfile from /docker/dockerfiles/.../Dockerfile and then builds the container(s) with the Docker Python API, collects some stats and creates summaries, notifies of failures and the like.
containers_run.py also imports containers_info.py and checks which containers should be run. It contains the information of which volumes to map to, which ports to use, basically all the stuff that would go in a YAML file to describe the container and a bit of management of the currently running container along with it.
It contains multiple snippets like
def run_websites(info):
    #~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    container_name = 'websites'
    #~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    new_container_name = container_name
    if info['auto-run']: rename_and_stop_container(container_name)
    else: new_container_name = container_name + '-prep'
    #~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    container = client.containers.run(
        detach=True,
        name=new_container_name,
        hostname='docker-websites',
        image='myscope/websites:{}'.format(versions['websites']),
        command='python -u server.py settings:docker-lean app:websites id:hp-1 port:8080 domain:www.example.com',
        ports={'8080/tcp': ('172.17.0.1', 10001)},
        working_dir='/home/user/python/server/app',
        volumes={
            '/home/user/docker/volumes/websites': {'bind': '/home/user/python/server', 'mode': 'rw'},
        }
    )
    #patch = 'sed -i.bak s/raise\ ImportError/name\ =\ \\"libc.so.6\\"\ #\ raise\ ImportError/g /usr/lib/python2.7/site-packages/twisted/python/_inotify.py'
    #print container.exec_run(patch, user='root')
    #~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    if info['auto-run'] and info['auto-delete-old']: remove_container(container_name + '-old')
    #~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Now I want to move away from this custom solution and use something open source that will allow me to scale this approach to multiple machines. Currently I can copy ~/docker/ among servers and execute the modified scripts to obtain the machines I need, but I think that Docker Swarm or Kubernetes is designed to solve these issues. At least that's the impression I have.
My Python solution was born while I was learning Docker, automating it via the Docker Python API helped me a lot with learning Dockerfiles, since I could automate the entire process and mistakes in the Dockerfiles would only mean a little bit of lost time.
Another important benefit of this Python script approach was that I was able to automate the creation of dozens of instances of the webserver on the same machine (assuming that this would make sense to do) and have Nginx adapt perfectly to this change (adding/removing proxies dynamically, reloading its configuration).
So, which technology should I start looking into in order to replace my current system? Also, I don't intend to run many machines, initially only two (main + backup), but I would like to be able, at any point in time, to add more machines and distribute the load among them, ideally by just changing some settings in a configuration file.
What is the current approach to solving these issues?
There are a number of tools you could use in this scenario. If you just plan on using a single machine, docker-compose could be the solution you are looking for. It is driven by a YAML file (docker-compose.yml) and supports the same build context as standard Docker and Kubernetes. It is really easy to get multiple instances of a pod or container running; just using the --scale flag eliminates a lot of the headache.
If you are planning on running this on multiple machines, I'd say kubernetes is probably going to be your best bet. It's really well set up for it. Admittedly, I don't have a lot of experience in Swarm, but it's analogous from what I understand. The benefit there is that kubernetes can also handle the load-balancing for you, whereas docker-compose does not, and you'd have to use some sort of proxy (like Nginx) for that. It's not horrible, but also not the most straightforward if you haven't done something like that before
I have a script that gets a file input plus some info, runs a couple of (possibly interdependent) programs on it using subprocess module, and distributes the output over the file-system.
Only a few parts can be tested in isolation by traditional unit testing, so I'm searching for a convenient way to automate the integration testing (see if the output files exist in the right locations, in the right number, of the right size, etc.).
I initially thought that setUp and tearDown methods from the default unittest module could help me, but they are re-run with each test, not once for the entire test suite, so it is not an option. Is there any way to make the unittest module run a global setUp and tearDown once? Or an alternative module/tool that I can use? Eclipse/PyDev integration would be a bonus.
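For concreteness, a simplified sketch of the situation I mean (the helper names are placeholders for the real work):

import unittest

class OutputFilesTest(unittest.TestCase):
    def setUp(self):
        run_script_and_generate_outputs()  # re-executed before *every* test

    def tearDown(self):
        clean_up_output_files()            # re-executed after *every* test

    def test_expected_files_exist(self):
        ...

    def test_output_files_have_expected_sizes(self):
        ...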
What is the latest way to write Python tests? What modules/frameworks to use?
And another question: are doctest tests still of any value? Or should all the tests be written in a more modern testing framework?
Thanks, Boda Cydo.
The usual way is to use the built-in unittest module for creating unit tests and bundling them together into test suites which can be run independently. unittest is very similar to (and inspired by) jUnit and thus very easy to use.
If you're interested in the very latest changes, take a look at the new PyCon talk by Michael Foord:
PyCon 2010: New and Improved: Coming changes to unittest
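To make that concrete, a minimal sketch of the unittest style described above (the class and test names are illustrative):

import unittest

class MathTest(unittest.TestCase):
    def test_addition(self):
        self.assertEqual(1 + 1, 2)

def suite():
    # Bundle the test case into a suite that can be run on its own.
    return unittest.TestLoader().loadTestsFromTestCase(MathTest)

if __name__ == '__main__':
    unittest.TextTestRunner().run(suite())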
Using the built-in unittest module is as relevant and easy as ever. The other unit testing options, py.test, nose, and twisted.trial, are mostly compatible with unittest.
Doctests are of the same value they always were—they are great for testing your documentation, not your code. If you are going to put code examples in your docstrings, doctest can assure you keep them correct and up to date. There's nothing worse than trying to reproduce an example and failing, only to later realize it was actually the documentation's fault.
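A tiny illustration of that point: doctest checks that the example embedded in the docstring still matches what the code actually does.

def add(a, b):
    """Return the sum of a and b.

    >>> add(2, 3)
    5
    """
    return a + b

if __name__ == '__main__':
    import doctest
    doctest.testmod()  # re-runs every docstring example and reports mismatches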
I don't know much about doctests, but at my university, nose testing is taught and encouraged.
Nose can be installed by following this procedure (I'm assuming you're using a PC - Windows OS):
install setuptools
Run DOS Command Prompt (Start -> All Programs -> Accessories -> Command Prompt)
For this step to work, you must be connected to the internet. In DOS, type: C:\Python25\Scripts\easy_install nose
If you are on a different OS, check this site
EDIT:
It's been two years since I originally wrote this post. Now, I've learned of a programming principle called Design by Contract. This allows a programmer to define preconditions, postconditions and invariants (called contracts) for all functions in their code. The effect is that an error is raised if any of these contracts are violated.
The DbC framework that I would recommend for Python is called PyContract. I have successfully used it in my evolutionary programming framework.
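To illustrate the Design by Contract idea itself (plain asserts here, not PyContract's actual syntax):

import math

def square_root(x):
    # Precondition: the caller must supply a non-negative number.
    assert x >= 0, "precondition violated: x must be non-negative"
    result = math.sqrt(x)
    # Postcondition: squaring the result gives back (approximately) x.
    assert abs(result * result - x) <= 1e-9 * max(1.0, x), "postcondition violated"
    return result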
In my current project I'm using unittest, minimock, and nose. In the past I've made heavy use of doctests, but in large projects some tests can get kinda unwieldy, so I tend to reserve doctests for simpler functions.
If you are using setuptools or distribute (you should be switching to distribute), you can set up nose as the default test collector so that you can run your tests with "python setup.py test"
setup(name='foo',
      ...
      test_suite='nose.collector',
      ...
      )
Now running "python setup.py test" will invoke nose, which will crawl your project for things that look like tests and run them, accumulating the results. If you also have doctests in your project, you can run nosetests with the --with-doctest option to enable the doctest plugin.
nose also has integration with coverage
nosetests --with-coverage.
You can also use the --cover-html --cover-html-dir options to generate an HTML coverage report for each module, with each line of code that is not under test highlighted. I wouldn't get too obsessed with getting coverage to report 100% test coverage for all modules. Some code is better left for integration tests, which I'll cover at the end.
I have become a huge fan of minimock, as it makes testing code with a lot of external dependencies really easy. While it works really well when paired with doctest, it can be used with any testing framework via the minimock.TraceTracker class. I would encourage you to avoid using it to test all of your code though, since you should still try to write your code so that each unit can be tested in isolation without mocking. Sometimes that's not possible though.
Here is an (untested) example of such a test using minimock and unittest:
# tests/test_foo.py
import minimock
import unittest

import foo

class FooTest(unittest.TestCase):

    def setUp(self):
        # Track all calls into our mock objects. If we don't use a TraceTracker
        # then all output will go to stdout, but we want to capture it.
        self.tracker = minimock.TraceTracker()

    def tearDown(self):
        # Restore all objects in global module state that minimock had
        # replaced.
        minimock.restore()

    def test_bar(self):
        # foo.bar invokes urllib2.urlopen, and then calls read() on the
        # resulting file object, so we'll use minimock to create a mocked
        # urllib2.
        urlopen_result = minimock.Mock('urlobject', tracker=self.tracker)
        urlopen_result.read = minimock.Mock(
            'urlobj.read', tracker=self.tracker, returns='OMG')
        foo.urllib2.urlopen = minimock.Mock(
            'urllib2.urlopen', tracker=self.tracker, returns=urlopen_result)

        # Now when we call foo.bar(URL) and it invokes
        # urllib2.urlopen(URL).read(), it will not actually send a request
        # to URL, but will instead give us back the dummy response body 'OMG',
        # which it then returns.
        self.assertEqual(foo.bar('http://example.com/foo'), 'OMG')

        # Now we can get trace info from minimock to verify that our mocked
        # urllib2 was used as intended. self.tracker has traced our calls to
        # urllib2.urlopen().
        minimock.assert_same_trace(self.tracker, """\
Called urllib2.urlopen('http://example.com/foo')
Called urlobj.read()
Called urlobj.close()""")
Unit tests shouldn't be the only kind of tests you write, though. They are certainly useful and IMO extremely important if you plan on maintaining this code for any extended period of time. They make refactoring easier and help catch regressions, but they don't really test how the various components interact with each other (if you do it right).
When I start getting to the point where I have a mostly finished product with decent test coverage that I intend to release, I like to write at least one integration test that runs the complete program in an isolated environment.
I've had a lot of success with this on my current project. I had about 80% unit test coverage, and the rest of the code was stuff like argument parsing, command dispatch and top level application state, which is difficult to cover in unit tests. This program has a lot of external dependencies, hitting about a dozen different web services and interacting with about 6,000 machines in production, so running this in isolation proved kinda difficult.
I ended up writing an integration test which spawns a WSGI server written with eventlet and webob that simulates all of the services my program interacts with in production. Then the integration test monkey patches our web service client library to intercept all HTTP requests and send them to the WSGI application. After doing that, it loads a state file that contains a serialized snapshot of the state of the cluster, and invokes the application by calling its main() function. Now all of the external services my program interacts with are simulated, so I can run my program as it would be run in production in a repeatable manner.
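A rough sketch of that pattern (the real services, client library and state file are of course specific to that project; the names below are illustrative):

from webob import Request, Response

def fake_services_app(environ, start_response):
    # Stands in for every external web service the program talks to.
    request = Request(environ)
    response = Response(json_body={'status': 'ok', 'path': request.path})
    return response(environ, start_response)

def fake_http_request(method, url, body=None):
    # Replacement for the client library's HTTP call: the request is routed to
    # the in-process WSGI app instead of going out over the network.
    request = Request.blank(url)
    request.method = method
    request.body = body or b''
    return request.get_response(fake_services_app)

# The integration test monkey patches the client library with fake_http_request,
# loads the serialized cluster state, then calls the application's main().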
The important thing to remember about doctests is that the tests are based on string comparisons, and the way that numbers are rendered as strings will vary on different platforms and even in different python interpreters.
Most of my work deals with computations, so I use doctests only to test my examples and my version string. I put a few in the __init__.py since that will show up as the front page of my epydoc-generated API documentation.
I use nose for testing, although I'm very interested in checking out the latest changes to py.test.