If I have an application built, what is the protocol for testing the actual application?
I'm just understanding testing, and a test for an extension where you'd construct a shell application and then test your extension makes sense to me, but not if I want to test parts of an actual application I'm constructing.
I'm wondering if someone has any pointers, guides, or thoughts on how they go about packaging and testing their Flask applications. What I've tried so far (importing the app into a test and starting to build tests for it) has been both unpleasant and unsuccessful. I'm at a point where I know I need the application to do X, Y, Z, and I can save a lot of future time by building a test that ensures X, Y, Z happen. However, building a separate test application seems time-consuming and unproductive.
There are many ways to test your application.
Flask's documentation provides information on how to initialize your app and make requests to it:
import os
import tempfile
import unittest

import flaskr

class FlaskrTestCase(unittest.TestCase):

    def setUp(self):
        # Point the app at a throwaway database file for each test.
        self.db_fd, flaskr.app.config['DATABASE'] = tempfile.mkstemp()
        self.app = flaskr.app.test_client()
        flaskr.init_db()

    def tearDown(self):
        os.close(self.db_fd)
        os.unlink(flaskr.app.config['DATABASE'])

    def test_empty_db(self):
        rv = self.app.get('/')
        # rv.data is bytes on Python 3, hence the b'' literal.
        assert b'No entries here so far' in rv.data
This allows you to request any route using Flask's test_client. You can request every route in your application and assert that it returns the data you expect.
You can write tests for specific functions too. Just import the function and test it accordingly.
You shouldn't have to write "a separate test application" at all. However, you might have to do things like load test data or mock objects/classes. These are not very straightforward, but there are many blog posts and Python tools that will help you with them.
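As one way to picture "import the function and mock what it depends on", here is a minimal sketch using the standard library's unittest.mock. The function is_expired is a hypothetical stand-in for a piece of your own application; it is not part of Flask or flaskr.

```python
import time
import unittest
from unittest import mock


def is_expired(created_at, ttl_seconds):
    """Hypothetical application function that depends on the system clock."""
    return time.time() - created_at > ttl_seconds


class IsExpiredTest(unittest.TestCase):
    def test_fresh_entry_is_not_expired(self):
        # Freeze the clock so the test does not depend on when it runs.
        with mock.patch("time.time", return_value=1000.0):
            self.assertFalse(is_expired(created_at=990.0, ttl_seconds=60))

    def test_old_entry_is_expired(self):
        with mock.patch("time.time", return_value=1000.0):
            self.assertTrue(is_expired(created_at=900.0, ttl_seconds=60))
```

Saved in a test module, this can be run with "python -m unittest". The same pattern applies to anything external the function touches (database, HTTP client, filesystem).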
I would suggest reading the Flask testing documentation to start with. It provides a great overview.
Additionally, it would be extremely helpful for you to provide specifics.
What I've tried so far (importing the app into a test and starting to
build tests for it) has been both unpleasant and unsuccessful
That description is not very constructive. Are you getting errors when you execute your tests? If so, could you post them? How is it unsuccessful? Do you not know which parts to test? Are you having problems executing your tests? Loading data? Testing the responses? Errors?
I would like to measure the coverage of my Python code which gets executed in the production system.
I want an answer to this question:
Which lines get executed often (hot spots) and which lines are never used (dead code)?
Of course this must not slow down my production site.
I am not talking about measuring the coverage of tests.
I assume you are not talking about test suite code coverage which the other answer is referring to. That is a job for CI indeed.
If you want to know which code paths are hit often in your production system, then you're going to have to do some instrumentation / profiling. This will have a cost. You cannot add measurements for free. You can do it cheaply though and typically you would only run it for short amounts of time, long enough until you have your data.
Python has cProfile to do full profiling, measuring call counts per function etc. This will give you the most accurate data but will likely have relatively high impact on performance.
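For reference, a minimal sketch of what full profiling with cProfile can look like. The workload busy_work and the helper profile_call are hypothetical names made up for this illustration; only cProfile and pstats come from the standard library.

```python
import cProfile
import io
import pstats


def busy_work(n):
    """Hypothetical workload standing in for real application code."""
    return sum(i * i for i in range(n))


def profile_call(func, *args, **kwargs):
    """Run func under cProfile and return (result, text report)."""
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        result = func(*args, **kwargs)
    finally:
        profiler.disable()
    buffer = io.StringIO()
    # Sort by cumulative time and keep only the ten most expensive entries.
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(10)
    return result, buffer.getvalue()
```

The report lists call counts and timings per function. Every call is instrumented, which is where the performance impact mentioned above comes from.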
Alternatively, you can do statistical profiling which basically means you sample the stack on a timer instead of instrumenting everything. This can be much cheaper, even with high sampling rate! The downside of course is a loss of precision.
Even though it is surprisingly easy to do in Python, this stuff is still a bit much to put into an answer here. There is an excellent blog post by the Nylas team on this exact topic though.
The sampler below was lifted from the Nylas blog with some tweaks. After you start it, it fires an interrupt every millisecond and records the current call stack:
import collections
import signal

class Sampler(object):
    def __init__(self, interval=0.001):
        self.stack_counts = collections.defaultdict(int)
        self.interval = interval

    def start(self):
        # SIGVTALRM is delivered when the ITIMER_VIRTUAL timer expires; that
        # timer only counts CPU time used by this process.
        signal.signal(signal.SIGVTALRM, self._sample)
        signal.setitimer(signal.ITIMER_VIRTUAL, self.interval, 0)

    def _sample(self, signum, frame):
        # Walk from the interrupted frame up to the outermost caller.
        stack = []
        while frame is not None:
            formatted_frame = '{}({})'.format(
                frame.f_code.co_name,
                frame.f_globals.get('__name__'))
            stack.append(formatted_frame)
            frame = frame.f_back
        formatted_stack = ';'.join(reversed(stack))
        self.stack_counts[formatted_stack] += 1
        # Re-arm the one-shot timer for the next sample.
        signal.setitimer(signal.ITIMER_VIRTUAL, self.interval, 0)
You inspect stack_counts to see what your program has been up to. This data can be plotted in a flame-graph which makes it really obvious to see in which code paths your program is spending the most time.
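To get from stack_counts to a flame graph, one option is to write the counts out in the "folded stacks" text format (one line per stack, frames joined by semicolons, followed by a space and the sample count), which is what the common flame graph scripts read. A small sketch; the helper name to_folded_lines is my own and not part of the Nylas code:

```python
def to_folded_lines(stack_counts):
    """Render {'a;b;c': n} as 'a;b;c n' lines, most frequent stack first."""
    ordered = sorted(stack_counts.items(), key=lambda item: (-item[1], item[0]))
    return ["{} {}".format(stack, count) for stack, count in ordered]


if __name__ == "__main__":
    # Example data in the same shape the sampler collects.
    sample = {
        "<module>(__main__);main(__main__)": 3,
        "<module>(__main__);main(__main__);work(app)": 7,
    }
    print("\n".join(to_folded_lines(sample)))
```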
If I understand it right, you want to learn which parts of your application are used most often by users.
TL;DR:
Use one of the metrics frameworks for Python if you do not want to do it by hand. Some of them are listed below:
DataDog
Prometheus
Prometheus Python Client
Splunk
It is usually done at the function level, and the details depend on the application:
If it is a desktop app with internet access:
You can create a simple db and record how many times your functions are called. To accomplish this, write a simple function and call it inside every function that you want to track. After that, you can define an asynchronous task to upload your data to the internet.
If it is a web application:
You can track which functions are called from JS (mostly preferred for user behaviour tracking) or from the web API. It is good practice to work from the outside in. First detect which endpoints are called frequently (if you are using a proxy like nginx, you can analyze the server logs to gather this information; it is the easiest and cleanest way). After that, add logging to every other function that you want to track and analyze your logs every week or month.
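Both the "call a tracking function inside every function" idea and the "insert a logger" idea can be wrapped in a decorator so the tracked functions stay clean. A minimal sketch; tracked, call_counts and the two example functions are hypothetical names for illustration, and a real setup would ship the counts to whatever metrics backend you use:

```python
import collections
import functools
import logging

logger = logging.getLogger("usage")
call_counts = collections.Counter()


def tracked(func):
    """Count and log every call to func."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        call_counts[func.__qualname__] += 1
        logger.info("called %s", func.__qualname__)
        return func(*args, **kwargs)
    return wrapper


@tracked
def export_report(fmt):
    """Hypothetical application function."""
    return "report." + fmt


@tracked
def delete_account():
    """Hypothetical application function that users rarely call."""
    return True
```

After some traffic, call_counts.most_common() shows the hot functions, and tracked functions missing from it are candidates for dead code.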
But if you want to analyze your production code line by line (which is a very bad idea), you can start your application under one of the Python profilers. Python ships with one already: cProfile.
Maybe make a text file and, in every method of your program, append a line to it such as "Method one executed". Run the web application thoroughly about 10 times, as a viewer would, and then write a Python program that reads the file, counts each specific entry (or even a pattern), and outputs the totals.
I'm working on setting up my team's new unit test and integration test infrastructure and want to make sure I'm starting off by selecting the correct test frameworks. I'm an embedded developer testing code running on a VxWorks operating system with a C/C++ production codebase.
We need a framework capable of directly testing C/C++ for unit testing, so for our unit tests I chose Googletest as our framework.
However, for integration tests we've generally tested using Python scripts (with no test framework). The Python scripts connect to the embedded system over a network and test cases via sending commands and receiving telemetry.
Would using pytest as a test framework be beneficial to the way we're currently using Python for integration testing an embedded system? Most of the examples I've seen use pytest in a more unit test fashion by creating assertions for single functions in a Python production codebase.
EDIT:
Per hoefling's comment, I'll provide a (very simplified) example of one of our existing Python integration test cases, and also what I believe its corresponding pytest implementation would be.
# Current example
def test_command_counter():
    preTestCmdCount = getCmdCountFromSystem()
    sendCommandToSystem()
    postTestCmdCount = getCmdCountFromSystem()
    if (postTestCmdCount != (preTestCmdCount + 1)):
        print("FAIL: Command count did not increment!")
    else:
        print("PASS")

# Using pytest?
def test_command_counter():
    preTestCmdCount = getCmdCountFromSystem()
    sendCommandToSystem()
    postTestCmdCount = getCmdCountFromSystem()
    assert postTestCmdCount == (preTestCmdCount + 1)
So, correct me if I'm wrong, but it appears that the advantages of using Pytest over plain Python for this simplified case would be:
Being able to make use of pytest's automated test case discovery, so that I can easily run all of my test functions instead of having to create custom code to do so.
Being able to make use of the 'assert' syntax which will automatically generate pass/fail statements for each test instead of having to manually implement pass/fail print statements for each test case
I've been in a similar situation, and from what I gathered unit testing frameworks are NOT appropriate for integration testing on embedded systems. A similar question was asked here:
Test framework for testing embedded systems in Python
We personally use Google's OpenHTF to automate both the integration testing (as in your case) and the production verification testing, which includes bringup, calibration and overall verification of assembly.
Check it out: https://github.com/google/openhtf
We automate advanced test equipment such as RF switches and Spectrum Analysers all in Python in order to test and calibrate our RF units, which operate in the >500 MHz range.
Using OpenHTF, you can create complete tests with measurements very quickly. It provides you with a builtin web GUI and allows you to write your custom export 'callbacks'.
We're currently building a complete test solution for hardware testing. I'd be glad to help with OpenHTF if needed, as we're basing our flagship implementation on it.
This thread's old, but these suggestions might help someone...
For unit testing embedded C on a host PC and on target we use Unity and CMock. http://www.throwtheswitch.org/
For hardware-in-the-loop testing we use pytest.
Both work well and are part of our Jenkins release workflow.
I have a script that gets a file input plus some info, runs a couple of (possibly interdependent) programs on it using subprocess module, and distributes the output over the file-system.
Only a few parts can be tested in isolation by traditional unit testing, so I'm searching for a convenient way to automate the integration testing (checking that the output files exist in the right locations, in the right number, of the right size, etc.).
I initially thought that setUp and tearDown methods from the default unittest module could help me, but they are re-run with each test, not once for the entire test suite, so it is not an option. Is there any way to make the unittest module run a global setUp and tearDown once? Or an alternative module/tool that I can use? Eclipse/PyDev integration would be a bonus.
I want my Google App Engine webapp2 app to start up (create a new app instance) as quickly as possible. I was wondering what obvious slowdowns I should watch out for (I know, premature optimization, but I don't want to do a massive refactor at the end if I can help it).
I have a folder hierarchy similar to this:
-root_folder
    __init__.py
    main.py
    config.py
    routes.py
    models.py
    gviz_api.py
    ... 20 more .py files
    -web_folder
        __init__.py
        some_handlers.py
        more_handlers.py
        20 more .py files
        ..
    -data_model_folder
        __init__.py
        some_models.py
        more_ndb_models.py
        10 more model files
    -many more folders e.g. templates, simpleauth etc.
In main.py, I create an app instance with a router (the router is imported from routes.py). routes.py imports every single handler (assigning each route a handler), and every handler imports almost every data model. Does this mean that creating a new app instance will be very slow?
I'm expecting to have about 100 handlers and 30 data models by the end of my project, although many of them will be rarely used.
To import a data model (from inside some_handlers.py), would just the following be fast enough?
from root_folder.data_model_folder.more_ndb_models import special_model
Should I be looking to use the config / registry?
Webapp2 supports lazily-imported handlers.
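As far as I know, in webapp2 this is done by giving a route its handler as a dotted string instead of an imported class, so the module is only imported the first time the route is hit. The general mechanism can be sketched in plain Python as below. This is an illustration of the idea, not webapp2's actual implementation; LazyHandler and the routes table are hypothetical, and statistics.mean merely stands in for a handler.

```python
import importlib


class LazyHandler:
    """Resolve 'package.module.name' to an object on first use, then cache it."""

    def __init__(self, dotted_path):
        self.dotted_path = dotted_path
        self._target = None

    def resolve(self):
        if self._target is None:
            module_name, _, attr = self.dotted_path.rpartition(".")
            # The import cost is paid here, not at application start-up.
            module = importlib.import_module(module_name)
            self._target = getattr(module, attr)
        return self._target


# Hypothetical routing table: nothing is imported until a route is used.
routes = {"/stats": LazyHandler("statistics.mean")}
```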
Usually, slowdowns are due to importing large frameworks, not a large amount of application code. So I wouldn't worry too much about this, even if you have 100 .py files. (Trust me, 100 is not that much...) I'd also look into warmup requests.
I'm not a big fan of lazy import tricks -- they can cause complex failure modes in edge cases (i.e. hard to debug), and they don't benefit from the extra lenience App Engine gives to loading requests (check your logs for what it considers a loading request).
In particular, if you don't import all your model classes at the start, you run the risk of getting "No model class found for kind 'X'" errors.
What is the latest way to write Python tests? What modules/frameworks to use?
And another question: are doctest tests still of any value? Or should all the tests be written in a more modern testing framework?
Thanks, Boda Cydo.
The usual way is to use the builtin unittest module for creating unit tests and bundling them together to test suites which can be run independently. unittest is very similar to (and inspired by) jUnit and thus very easy to use.
If you're interested in the very latest changes, take a look at the new PyCon talk by Michael Foord:
PyCon 2010: New and Improved: Coming changes to unittest
Using the built-in unittest module is as relevant and easy as ever. The other unit testing options, py.test, nose, and twisted.trial, are mostly compatible with unittest.
Doctests are of the same value they always were—they are great for testing your documentation, not your code. If you are going to put code examples in your docstrings, doctest can assure you keep them correct and up to date. There's nothing worse than trying to reproduce an example and failing, only to later realize it was actually the documentation's fault.
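A small sketch of what "testing your documentation" means in practice. The function mean is a hypothetical example; the examples in its docstring are exactly what doctest executes and compares.

```python
import doctest


def mean(values):
    """Return the arithmetic mean of a non-empty sequence.

    >>> mean([1, 2, 3])
    2.0
    >>> mean([10])
    10.0
    """
    return sum(values) / len(values)


def check_docs(func):
    """Run the examples in func's docstring; return (failed, attempted)."""
    finder = doctest.DocTestFinder()
    runner = doctest.DocTestRunner(verbose=False)
    for test in finder.find(func, func.__name__, module=False,
                            globs={func.__name__: func}):
        runner.run(test)
    return runner.failures, runner.tries
```

In everyday use the helper is unnecessary: running "python -m doctest yourmodule.py" checks every docstring example in the file.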
I don't know much about doctests, but at my university, nose testing is taught and encouraged.
Nose can be installed by following this procedure (I'm assuming you're using a PC - Windows OS):
install setuptools
Run DOS Command Prompt (Start -> All Programs -> Accessories -> Command Prompt)
For this step to work, you must be connected to the internet. In DOS, type: C:\Python25\Scripts\easy_install nose
If you are on a different OS, check this site
EDIT:
It's been two years since I originally wrote this post. Now, I've learned of this programming principle called Designing by Contract. This allows a programmer to define preconditions, postconditions and invariants (called contracts) for all functions in their code. The effect is that an error is raised if any of these contracts are violated.
The DbC framework that I would recommend for Python is called PyContract. I have successfully used it in my evolutionary programming framework.
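To give a feel for the idea without depending on any library, here is a minimal hand-rolled sketch of preconditions and postconditions as a decorator. This is not PyContract's API (PyContract has its own syntax); contract, ContractViolation and variance are hypothetical names used only for illustration.

```python
import functools


class ContractViolation(Exception):
    """Raised when a precondition or postcondition does not hold."""


def contract(pre=None, post=None):
    """Check pre(*args, **kwargs) before the call and post(result) after it."""
    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if pre is not None and not pre(*args, **kwargs):
                raise ContractViolation("precondition of %s violated" % func.__name__)
            result = func(*args, **kwargs)
            if post is not None and not post(result):
                raise ContractViolation("postcondition of %s violated" % func.__name__)
            return result
        return wrapper
    return decorate


@contract(pre=lambda values: len(values) > 0, post=lambda result: result >= 0)
def variance(values):
    """Population variance; the contract demands non-empty input."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)
```

Calling variance([]) raises ContractViolation before the division by zero would ever happen, which is the point: the error names the broken contract rather than a symptom.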
In my current project I'm using unittest, minimock, and nose. In the past I've made heavy use of doctests, but in large projects some tests can get kinda unwieldy, so I tend to reserve doctests for simpler functions.
If you are using setuptools or distribute (you should be switching to distribute), you can set up nose as the default test collector so that you can run your tests with "python setup.py test"
setup(name='foo',
      ...
      test_suite='nose.collector',
      ...
Now running "python setup.py test" will invoke nose, which will crawl your project for things that look like tests and run them, accumulating the results. If you also have doctests in your project, you can run nosetests with the --with-doctest option to enable the doctest plugin.
nose also has integration with coverage:
nosetests --with-coverage
You can also use the --cover-html --cover-html-dir options to generate an HTML coverage report for each module, with each line of code that is not under test highlighted. I wouldn't get too obsessed with getting coverage to report 100% test coverage for all modules. Some code is better left for integration tests, which I'll cover at the end.
I have become a huge fan of minimock, as it makes testing code with a lot of external dependencies really easy. While it works really well when paired with doctest, it can be used with any testing framework via the minimock.TraceTracker class. I would encourage you to avoid using it to test all of your code though, since you should still try to write your code so that each unit can be tested in isolation without mocking. Sometimes that's not possible though.
Here is an (untested) example of such a test using minimock and unittest:
# tests/test_foo.py
import minimock
import unittest

import foo

class FooTest(unittest.TestCase):
    def setUp(self):
        # Track all calls into our mock objects. If we don't use a TraceTracker
        # then all output will go to stdout, but we want to capture it.
        self.tracker = minimock.TraceTracker()

    def tearDown(self):
        # Restore all objects in global module state that minimock had
        # replaced.
        minimock.restore()

    def test_bar(self):
        # foo.bar invokes urllib2.urlopen, and then calls read() and close()
        # on the resulting file object, so we'll use minimock to create a
        # mocked urllib2.
        urlopen_result = minimock.Mock('urlobj', tracker=self.tracker)
        urlopen_result.read = minimock.Mock(
            'urlobj.read', tracker=self.tracker, returns='OMG')
        foo.urllib2.urlopen = minimock.Mock(
            'urllib2.urlopen', tracker=self.tracker, returns=urlopen_result)

        # Now when we call foo.bar(URL) and it invokes
        # *urllib2.urlopen(URL).read()*, it will not actually send a request
        # to URL, but will instead give us back the dummy response body 'OMG',
        # which it then returns.
        self.assertEqual(foo.bar('http://example.com/foo'), 'OMG')

        # Now we can get trace info from minimock to verify that our mocked
        # urllib2 was used as intended. self.tracker has traced our calls to
        # *urllib2.urlopen()*
        minimock.assert_same_trace(self.tracker, """\
Called urllib2.urlopen('http://example.com/foo')
Called urlobj.read()
Called urlobj.close()""")
Unit tests shouldn't be the only kinds of tests you write though. They are certainly useful and IMO extremely important if you plan on maintaining this code for any extended period of time. They make refactoring easier and help catch regressions, but (if you do it right) they don't really test how the various components interact with each other.
When I start getting to the point where I have a mostly finished product with decent test coverage that I intend to release, I like to write at least one integration test that runs the complete program in an isolated environment.
I've had a lot of success with this on my current project. I had about 80% unit test coverage, and the rest of the code was stuff like argument parsing, command dispatch and top level application state, which is difficult to cover in unit tests. This program has a lot of external dependencies, hitting about a dozen different web services and interacting with about 6,000 machines in production, so running this in isolation proved kinda difficult.
I ended up writing an integration test which spawns a WSGI server written with eventlet and webob that simulates all of the services my program interacts with in production. Then the integration test monkey patches our web service client library to intercept all HTTP requests and send them to the WSGI application. After doing that, it loads a state file that contains a serialized snapshot of the state of the cluster, and invokes the application by calling its main() function. Now all of the external services my program interacts with are simulated, so that I can run my program as it would be run in production in a repeatable manner.
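The general shape of such a test can be sketched with the standard library alone. Note the substitutions: this uses wsgiref instead of eventlet and webob, and it hands the client a base URL rather than monkey patching a client library. fake_service, fetch_status and run_against_fake are hypothetical names for illustration.

```python
import threading
import urllib.request
from wsgiref.simple_server import WSGIRequestHandler, make_server


class QuietHandler(WSGIRequestHandler):
    def log_message(self, format, *args):
        pass  # keep test output clean


def fake_service(environ, start_response):
    """Simulated external web service: always answers with a fixed body."""
    body = b'{"status": "ok"}'
    start_response("200 OK", [("Content-Type", "application/json"),
                              ("Content-Length", str(len(body)))])
    return [body]


def fetch_status(base_url):
    """Stand-in for the program under test: one HTTP call to a service."""
    # An empty ProxyHandler makes sure the request goes straight to localhost.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(base_url + "/status", timeout=5) as response:
        return response.read()


def run_against_fake(client):
    """Start the fake service on a free port, run client against it, clean up."""
    server = make_server("127.0.0.1", 0, fake_service, handler_class=QuietHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        return client("http://127.0.0.1:%d" % server.server_port)
    finally:
        server.shutdown()
        server.server_close()
```

The integration test then asserts on what run_against_fake(fetch_status) returns, with no real service involved.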
The important thing to remember about doctests is that the tests are based on string comparisons, and the way that numbers are rendered as strings will vary on different platforms and even in different python interpreters.
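A short sketch of that pitfall, using floating point as the example. The helper run_example is a made-up name; it just runs a doctest snippet and reports the counts.

```python
import doctest


def run_example(source):
    """Run a doctest snippet given as text; return (failed, attempted)."""
    test = doctest.DocTestParser().get_doctest(source, {}, "example", None, 0)
    runner = doctest.DocTestRunner(verbose=False)
    # Discard the failure report; only the counts matter here.
    result = runner.run(test, out=lambda text: None)
    return result.failed, result.attempted


# Fails: the printed value is 0.30000000000000004, and doctest compares text.
fragile = ">>> 0.1 + 0.2\n0.3\n"

# Passes: rounding makes the rendered string predictable.
robust = ">>> round(0.1 + 0.2, 10)\n0.3\n"
```

Here run_example(fragile) reports one failure out of one example, while run_example(robust) reports none.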
Most of my work deals with computations, so I use doctests only to test my examples and my version string. I put a few in the __init__.py since that will show up as the front page of my epydoc-generated API documentation.
I use nose for testing, although I'm very interested in checking out the latest changes to py.test.