Looking for testing/QA ideas for Python Web Application Project

I have had the 'luck' of developing and enhancing a legacy Python web application for almost 2 years. The major contribution I consider I made is the introduction of unit tests, nosetests, pychecker and a CI server. Yes, that's right, there are still projects out there that do not have a single unit test (to be fair, it has a few doctests, but they are broken).
Nonetheless, progress is slow, because literally the coverage is limited by how many unit tests you can afford to write.
From time to time embarrassing mistakes still occur, and they do not look good on management reports. (e.g. even pychecker cannot catch certain "missing attribute" situations, and the program just blows up at run time)
I just want to know if anyone has any suggestions about what additional things I can do to improve the QA. The application uses WebWare 0.8.1, but I have experimentally ported it to CherryPy, so I can potentially take advantage of WSGI to conduct integration tests.
Mixed language development and/or hiring an additional tester are also options I am considering.
Nothing is too wild, as long as it works.

Feathers' great book, Working Effectively with Legacy Code, is the first resource I always recommend to anybody in your situation (wish I had it in hand before I faced it my first four times or so!-) -- not Python specific but a lot of VERY useful general-purpose pieces of advice.
Another technique I've been happy with is fuzz testing -- low-effort, great returns in terms of catching sundry bugs and vulnerabilities; check it out!
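To make the fuzz testing idea concrete, here is a minimal sketch using only the standard library. The function parse_quantity is a hypothetical stand-in for any input-handling code in the application, and the "allowed" exception list is an assumption about what counts as a documented rejection. Anything else that escapes is recorded as a candidate bug.

```python
import random
import string


def parse_quantity(text):
    """Hypothetical function under test: parse a positive integer quantity."""
    value = int(text.strip())
    if value <= 0:
        raise ValueError("quantity must be positive")
    return value


def fuzz(func, runs=1000, seed=0, allowed=(ValueError,)):
    """Call func with random strings; collect inputs that raise unexpected errors."""
    rng = random.Random(seed)  # fixed seed so any failure can be reproduced
    alphabet = string.printable
    failures = []
    for _ in range(runs):
        length = rng.randint(0, 20)
        candidate = "".join(rng.choice(alphabet) for _ in range(length))
        try:
            func(candidate)
        except allowed:
            pass  # documented rejection of bad input, not a bug
        except Exception as exc:  # anything else deserves a unit test
            failures.append((candidate, exc))
    return failures


if __name__ == "__main__":
    for bad_input, error in fuzz(parse_quantity):
        print(repr(bad_input), "->", repr(error))
```

Each input reported by such a loop can then be turned into a permanent unit test, so the cheap random search feeds the regular suite.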
Last but not least, if you do have the headcount & budget to hire one more engineer, please do, but make sure he or she is a "software engineer in testing", NOT a warm body banging at the keyboard or mouse for manual "testing" -- somebody who's rarin' to write and integrate all sorts of automated testing approaches as opposed to spending their days endlessly repeating (if they're lucky) the same manual testing sequences!!!
I'm not sure what you think mixed language dev't will buy you in terms of QA. WSGI OTOH will give you nice bottlenecks/hooks to exploit in your forthcoming integration-test infrastructure -- it's good for that (AND for sundry other things too;-).
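As a rough illustration of the WSGI hook mentioned above: because a WSGI application is just a callable, a test can invoke it in-process without starting a server. The sketch below uses only the standard library; the app function is a hypothetical stand-in for the ported application, and call_wsgi is an illustrative helper, not part of any framework.

```python
from wsgiref.util import setup_testing_defaults


def app(environ, start_response):
    """Hypothetical WSGI application standing in for the real one."""
    if environ["PATH_INFO"] == "/hello":
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [b"hello world"]
    start_response("404 Not Found", [("Content-Type", "text/plain")])
    return [b"not found"]


def call_wsgi(application, path, method="GET"):
    """Invoke a WSGI app in-process and return (status, headers, body)."""
    environ = {"SCRIPT_NAME": "", "PATH_INFO": path, "REQUEST_METHOD": method}
    setup_testing_defaults(environ)  # fills in the remaining required keys
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = dict(headers)

    result = application(environ, start_response)
    try:
        body = b"".join(result)
    finally:
        if hasattr(result, "close"):
            result.close()  # the WSGI spec asks callers to close the iterable
    return captured["status"], captured["headers"], body


if __name__ == "__main__":
    print(call_wsgi(app, "/hello"))
```

An integration test then reduces to asserting on the returned status, headers and body for each URL of interest.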

Automated testing seems to be a very interesting approach. If you are developing a web app, you may be interested in WebDriver http://code.google.com/p/webdriver/

Since it is a web app, I'm wondering whether browser-based testing would make sense for you. If so, check out Selenium, an open-source suite of test tools. Here are some items that might be interesting to you:
automatically starts and stops browser instances on major platforms (linux, win32, macos)
tests by emulating user actions on web pages (clicking, typing), JavaScript-based
uses assertions for behavioral results (new web page loaded, containing text, ...)
can record interactive tests in Firefox
can be driven by Python test scripts, using a simple communication API and running against a coordination server (Selenium RC)
can run multiple browsers on the same machine or multiple machines
It has a learning curve, but particularly the Selenium RC server architecture is very helpful in conducting automated browser tests.

Have a look at Twill. It's a headless web browser written in Python, specifically for automated testing. It can record and replay actions, and it can also hook directly into a WSGI stack.

Few things help as much as testing.
These two quotes are really important.
"how many unit tests you can afford to write."
"From time to time embarrassing mistakes still occur,"
If mistakes occur, you haven't written enough tests. If you're still having mistakes, then you can afford to write more unit tests. It's that simple.
Each embarrassing mistake is a direct result of not writing enough unit tests.
Each management report that describes an embarrassing mistake should also describe what testing is required to prevent that mistake from ever happening again.
A unit test is a permanent prevention of further problems.
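As a sketch of what "a permanent prevention" can look like for the "missing attribute" kind of failure described in the question: the Invoice class and the bug it once had are both hypothetical, invented here for illustration. The test pins the fixed behaviour so the same mistake cannot silently return.

```python
import unittest


class Invoice:
    """Hypothetical class standing in for legacy application code."""

    def __init__(self, amount, customer=None):
        self.amount = amount
        self.customer = customer

    def summary(self):
        # Imagined original bug: self.customer.name raised AttributeError
        # when customer was None. The guard below is the fix.
        name = self.customer.name if self.customer is not None else "unknown"
        return "%s: %.2f" % (name, self.amount)


class InvoiceRegressionTest(unittest.TestCase):
    def test_summary_without_customer(self):
        """Pins the fix for the AttributeError on invoices with no customer."""
        self.assertEqual(Invoice(10).summary(), "unknown: 10.00")
```

Saved in a test module, this runs with python -m unittest alongside the rest of the suite.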

Related

How to document existing Selenium Webdriver tests?

I am in charge of testing a web application using Selenium Webdriver with Python. Over the past year I created a large script (20K+ lines) where each test is a separate function. Now my boss wants me to document my tests, explaining in plain English what each test does. What tool would you recommend to document the steps your tests make?

I think this is a great question. Many people and companies don't bother managing their existing tests properly which leads to redundant and repeated code without having a clear idea what is covered by automated tests.
There is no single answer to this question but in general you can consider the following options:
Testing framework built-in reporting. In Java, for example, you have unit testing libraries like jUnit and TestNG. When they run, they generate certain output that can later be formatted and reviewed as the need arises. I am sure there is an implementation of a unit testing framework like this in Python too.
You can also consider using a BDD tool like Cucumber. This is a bit different and might not be suitable in certain cases when the tests are low level system checks. It can however help you organize your test scenarios and keep them in a readable form. It is also very good for reporting to a non-technical person.
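Since the question describes each test as a separate Python function, one low-tech option in the same spirit is to put the plain-English description in each function's docstring and generate the document from there. The sketch below assumes that convention; the two test functions and the describe_tests helper are hypothetical names made up for illustration.

```python
import inspect


def test_login_rejects_bad_password():
    """Open the login page, submit a wrong password, expect an error message."""


def test_search_returns_results():
    """Search for a known product and expect at least one result row."""


def describe_tests(namespace, prefix="test_"):
    """Return (name, first docstring line) pairs for test functions in a namespace."""
    rows = []
    for name in sorted(namespace):
        obj = namespace[name]
        if name.startswith(prefix) and inspect.isfunction(obj):
            doc = inspect.getdoc(obj) or "(no description yet)"
            rows.append((name, doc.splitlines()[0]))
    return rows


if __name__ == "__main__":
    # globals() is the namespace of this script; vars(some_module) also works
    for name, description in describe_tests(globals()):
        print("%s: %s" % (name, description))
```

Tests without a docstring show up as "(no description yet)", which doubles as a to-do list for the documentation effort.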

How to properly unit test a web app?

I'm teaching myself backend and frontend web development (I'm using Flask if it matters) and I need a few pointers when it comes to unit testing my app.
I am mostly concerned with these different cases:
The internal consistency of the data: that's the easy one - I'm aiming for 100% coverage when it comes to issues like the login procedure and, more generally, checking that everything that happens between the Python code and the database after every request remains consistent.
The JSON responses: What I'm doing atm is performing a test-request for every get/post call on my app and then asserting that the json response must be this-and-that, but honestly I don't quite appreciate the value in doing this - maybe because my app is still at an early stage?
Should I keep testing every json response for every request?
If yes, what are the long-term benefits?
External APIs: I read conflicting opinions here. Say I'm using an external API to translate some text:
Should I test only the very high level API, i.e. see if I get the access token and that's it?
Should I test that the returned json is what I expect?
Should I test nothing, to speed up my test suite and avoid making it dependent on a third-party API?
The outputted HTML: I'm lost on this one as well. Say I'm testing the function add_post():
Should I test that on the page that follows the request the desired post is actually there?
I started checking for the presence of strings/HTML tags in the raw response.data, but then I kind of gave up because 1) it takes a lot of time and 2) I would have to constantly rewrite the tests since I'm changing the app so often.
What is the recommended approach in this case?
Thank you and sorry for the verbosity. I hope I made myself clear!

Most of this is personal opinion and will vary from developer to developer.
There are a ton of Python libraries for unit testing - that's a decision best left to you as the developer of the project to find one that fits best with your tool set / build process.
This isn't exactly 'unit testing' per se, I'd consider it more like integration testing. That's not to say this isn't valuable, it's just a different task and will often use different tools. For something like this, testing will pay off in the long run because you'll have peace of mind that your bug fixes and feature additions aren't impacting your end to end code. If you're already doing it, I would continue. These sorts of tests are highly valuable when refactoring down the road to ensure consistent functionality.
I would not waste time testing 3rd party APIs. It's their job to make sure their product behaves reliably. You'll be there all day if you start testing 3rd party features. A big reason to use 3rd party APIs is so you don't have to test them. If you ever discover that your app is breaking because of a 3rd party API it's probably time to pick a different API. If your project scales to a size where you're losing thousands of dollars every time that API fails you have a whole new ball of issues to deal with (and hopefully the resources to address them) at that time.
In general, I don't test static content or HTML. There are tools out there (web scraping tools) that will let you trawl your own website for consistent functionality. I would personally leave this as a last priority for the final stages of refinement if you have time. The look and feel of most websites change so often that writing tests isn't worth it. Look and feel is also really easy to test manually because it's so visual.
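On the third-party API point above, one common way to keep the suite fast and independent of the external service is to replace the call with a stand-in and test only your own code around it. The sketch below uses unittest.mock from the standard library. The functions fetch_translation and translated_title are hypothetical, and passing the translator as a parameter is a design assumption made to keep the example self-contained.

```python
from unittest import mock


def fetch_translation(text, target_lang):
    """Hypothetical wrapper around the third-party translation API."""
    raise RuntimeError("network access is not available in tests")


def translated_title(post, target_lang, translate=fetch_translation):
    """Application code: translate a post title, falling back to the original."""
    try:
        return translate(post["title"], target_lang)
    except RuntimeError:
        return post["title"]


def test_translated_title_uses_api_result():
    fake = mock.Mock(return_value="Bonjour")
    assert translated_title({"title": "Hello"}, "fr", translate=fake) == "Bonjour"
    fake.assert_called_once_with("Hello", "fr")


def test_translated_title_falls_back_on_failure():
    fake = mock.Mock(side_effect=RuntimeError("service down"))
    assert translated_title({"title": "Hello"}, "fr", translate=fake) == "Hello"
```

Neither test touches the network, so the third-party service is never exercised, only the application's handling of its success and failure.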

Performance between Django and raw Python

I was wondering what the performance difference is between using plain python files to make web pages and using Django. I was just wondering if there was a significant difference between the two. Thanks

Django IS plain Python. So the execution time of each like-for-like statement or expression will be the same. What needs to be understood is that many, many components are put together to offer several advantages when developing for the web:
Removal of common tasks into libraries (auth, data access, templating, routing)
Correctness of algorithms (cookies/sessions, crypto)
Decreased custom code (due to libraries) which directly influences bug count, dev time etc
Following conventions leads to improved team work, and the ability to understand code
Plug-ability; Create or find new functionality blocks that can be used with minimal integration cost
Documentation and help; many people understand the tech and are able to help (StackOverflow?)
Now, if you were to write your own site from scratch, you'd need to implement at least several components yourself. You also lose most of the above benefits unless you spend an extraordinary amount of time developing your site. Django, and other web frameworks for every other language, are designed to provide the common stuff, and let you get straight to work on business requirements.
If you ever banged out custom session code and data access code in PHP before the rise of web frameworks, you won't even think of the performance cost associated with a framework that makes your job interesting and eas(y)ier.
Now, that said, Django ships with a LOT of components. It is designed in such a way that most of the time, they won't affect you. Still, a surprising amount of code is executed for each request. If you build out a site with Django, and the performance just doesn't cut it, you can feel free to remove all the bits you don't need. Or, you can use a 'slim' python framework.
Really, just use Django. It is quite awesome. It powers many sites millions of times larger than anything you (or I) will build. There are ways to improve performance significantly, like utilizing caching, rather than optimizing a loop over custom Middleware.

Depends on how your "plain Python" makes web pages. If it uses a templating engine, for instance, the performance of that engine is going to make a huge difference. If it uses a database, what kind of data access layer you use (in the context of the requirements for that layer) is going to make a difference.
The question, thus, becomes a question of whether your arbitrary (and presently unstated) toolchain choices have better runtime performance than the ones selected by Django. If performance is your primary, overriding goal, you certainly should be able to make more optimal selections. However, in terms of overall cost -- ie. buying more web servers for the slower-runtime option, vs buying more programmer-hours for the more-work-to-develop option -- the question simply has too many open elements to be answerable.

Premature optimisation is the root of all evil.
Django makes things extremely convenient if you're doing web development. That plus a great community with hundreds of plugins for common tasks is a real boon if you're doing serious work.
Even if your "raw" implementation is faster, I don't think it will be fast enough to seriously affect your web application. Build it using tools that work at the right level of abstraction and if performance is a problem, measure it and find out where the bottlenecks are and apply optimisations. If after all this you find out that the abstractions that Django creates are slowing your app down (which I don't expect that they will), you can consider moving to another framework or writing something by hand. You will probably find that you can get performance boosts by caching, load balancing between multiple servers and doing the "usual tricks" rather than by reimplementing the web framework itself.
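For the "measure it and find out where the bottlenecks are" step, the standard library's cProfile is one place to start. This is only a sketch: render_page and slow_join are hypothetical stand-ins for a request handler, and the profile helper is an illustrative wrapper rather than anything Django provides.

```python
import cProfile
import io
import pstats


def slow_join(rows):
    """Hypothetical hot spot: builds HTML by repeated string concatenation."""
    out = ""
    for i in range(rows):
        out += "<li>%d</li>" % i
    return out


def render_page(rows=2000):
    """Hypothetical request handler that calls the slow step."""
    return slow_join(rows)


def profile(func, *args, **kwargs):
    """Run func under cProfile and return (result, text report)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(10)  # ten most expensive entries
    return result, stream.getvalue()


if __name__ == "__main__":
    _, report = profile(render_page)
    print(report)
```

The report lists functions by cumulative time, which shows whether the cost sits in your own code, the templating layer, or the database access before any framework decision is made.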

Django is also plain Python.
See, the performance mostly relies on how efficient your code is.
Most of the performance issues of software arise from inefficient code, rather than the choice of tools and language. So the implementation matters. AFAIK Django does this excellently and its performance is above the mark.

What methods and tools do you use to design and analyze the workflow in a web application (for a tiny team)

Note: I use TRAC integrated with SVN, framework testing tools, an excellent mixture of staging servers, development servers, and other tools to speed development and keep track of tasks.
I am asking about the specific process of design, and even more specifically, the design of functionality and flow in a web application.
--- Original question ---
So far I spend a lot of time with my text editor open basically talking to myself, then coding for a while, then talking to myself again. When there is more than just me we do some whiteboarding, but that's about it.
What do you find works well, specifically for projects for very small teams or one-developer shows?
BTW I usually develop with Django, this last project also involves RabbitMQ and Orbited, plus a fair chunk of jQuery-assisted JavaScript.

Pencils!
Seriously, we use JIRA to track work/issues, and Confluence to track requirements before they're baked enough to put in JIRA. Anything to scratch together wire-frames, etc. (including OmniGraffle, and pencils).
I find the combination of JIRA and Confluence for tracking chunks of work and longer-lived concepts and standards pretty darned effective.

Another mainly solo developer here. I recommend using a bug/issue tracker. I use TRAC (although I'm looking at alternatives). It might seem strange, since you're the one creating, assigning, and closing all tickets, but it really helps to plan out/prioritize development. I also use the wiki to organize my thoughts on the roadmap, main goals of each release, etc.

+1 for a good question.
Web development is not much different to any other development.
P.20 of that venerable classic, The Mythical Man-Month, breaks a typical project down into 1/3 planning, 1/6 coding, 1/4 component test and 1/4 system test. I might catch some flak for that, YMMV and all, but that looks about right to me.
Whether or not you agree with the proportions, the message is "don't jump right in and code; think about it first" (measure twice, cut once). How often have you jumped in and coded only to get near to "the end" and discover that you have a fundamental design flaw and have to throw away much of your code & rewrite chunks more?
You need methodologies (Processes) and Tools, and each can necessitate the other.
Methodologies
Design it first! Gather requirements, make a system level design document, then a detailed design. If you are more than one, have these reviewed by someone (maybe the client can review the high level docs?). Even if you are alone, the simple act of writing it down forces you to slow down and think and will uncover problems. A good idea is to have a requirements traceability matrix to ensure that each requirement gets designed, implemented and tested somewhere.
After you review the high level design, you can begin the detailed design, and after you review that, you can begin to implement. When the high level design is reviewed you can, in parallel, or later, produce a high level test spec. When the detailed level design is reviewed you can, in parallel, or later, produce unit test specs.
Note that test cases should be automated and should require no human interaction. Get into the habit of running regression tests after every code change - automate this if you can, with a nightly build and/or coupled with check-in to your version control system.
When everything is thoroughly unit tested, you can begin your system level testing.
Tools
At the very least you ought to be considering these:
A good IDE (WYSIWYG for web design), preferably with debugging capabilities, and it would be nice if it interfaced with your version control system, bug tracker, etc. A spill chuker is useful for websites ;-)
A project management tool to plan the project (Open Workbench does some nice Gantt charts)
A version control system.
A change request and bug tracker.
An automated test system.
An automated build system like Hudson (it may not seem relevant if you don't compile and link, but at least it can verify that all files exist and can schedule regression testing for you)
A backup system in case of disk crash, laptop loss, etc.
And if all that seems like too much "extra" to do, I was sceptical too once, until I saw that it actually saves time because you discover problems earlier when they cost less to fix. In fact, I am so sure of this that I do all of my personal one-man hobby projects this way. All of them.

Objective reasons for using Python or Ruby for a new REST Web API

So this thread is definitely NOT a thread for why Python is better than Ruby or the inverse. Instead, this thread is for objective criticism on why you would pick one over the other to write a RESTful web API that's going to be used by many different clients, (mobile, web browsers, tablets etc).
Again, don't compare Ruby on Rails vs Django. This isn't a web app that's dependent on high level frameworks such as RoR or Django. I'd just like to hear why someone might choose one over the other to write a RESTful web API that they had to start tomorrow, completely from scratch and reasons they might go from one to another.
For me, syntax and language features are completely superfluous. They both offer an abundant amount of features and certainly both can achieve the same exact end goals. I think if someone flips a coin, it's a good enough reason to use one over the other. I'd just love to see how some of you web service experts who are very passionate about your work respond to why you would use one over the other in a very objective format.

I would say the important thing is that regardless of which you choose, make sure that your choice does not leak through your REST API. It should not matter to the client of your API which you chose.

I know Ruby, don't know python... you can see which way I'm leaning toward, right?

Choose the one you're most familiar with and most likely to get things done with the fastest.

Yeah, flip a coin. The truth is that you're going to find minimalist frameworks in either language. Heroku is a pretty strong reason to say Ruby but there may be other similar hosts for Python. But Heroku makes it stupid easy to deploy your api into the cloud whether it's Rails or some other Ruby project that uses Rack. WSGI doesn't give you this option.
As far as the actual implementation goes though, I'm guessing that you'll find that they're both completely competent languages and both a joy to program in.

I think they are fairly evenly matched in features. I prefer Python, but I have been using it for over a decade so I freely admit that what follows is totally biased.
IMHO Python is more mature - there are more libraries for it (although Ruby may be catching up), and the included libraries I think are better designed. The language evolution process is more mature too, with each proposed feature discussed in public via the PEPs before the decision is made to include them in a release. I get the impression that development of the Ruby language is much more ad-hoc.
Python is widely used in a lot of areas apart from web development - scientific computing, CGI rendering pipelines, distributed computing, Linux GUI tools etc. Ruby got very little attention before Rails came along, so I get the impression that most Ruby work is focused on web development. That may not be a problem if that is all you want to do with the language, but it does mean that Python has a more diverse user base and a more diverse set of libraries.
Python is faster too.

Ruby + Sinatra
Very easy to use with/as rack middleware - someone's already mentioned heroku

Either will do a great job and you'll gain in other ways from learning something new. Why not spend a couple of days with each? See how far you can get with a simple subset of the problem, then see how you feel. For bonus points report back here and answer your own question!
