I am reproducing a spreadsheet in Python. The spreadsheet contains the data and the processing logic, but only for Mondays, not for the other weekdays.
I want to run the Python code every day, and if it is Monday, I want to compare the Python results with the spreadsheet results. I have 20+ tests spread across the Python code doing the comparisons. The tests include: 1) checking that the data I got from the production database is the same as in the Excel file, and 2) checking that the Python code produces the same results as Excel (i.e. that the logic is the same) when the inputs are the same.
How can I turn the tests on for Mondays without inserting 20+ "if monday: run_test_n()" checks into the Python code?
I don't think I can separate the tests from the source code, since later tests take inputs from previous processing steps.
It looks like you have a limited number of choices.
You could refactor your code to pull the tests together so they can be activated with fewer if checks. You say that may not be possible, but it seems to me that you should try that first. You recognize there is a smell in your code, so try some refactoring techniques to separate the tests from the source code. There are many books and web sites that discuss them.
You could leave your code as is. This will build up technical debt, but that may be necessary. Use the 20+ if statements and comment them well so they can be found and modified later if needed. At the very least, do the date check only once in your code, set a Boolean variable, and test that variable rather than redoing the date check.
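As a minimal sketch of that "check once, then gate on a flag" idea (the helper name check_against_excel and the example values below are made up, not taken from the question):

import datetime

# Do the date check once, then gate every comparison on this flag
# instead of repeating the weekday logic 20+ times.
RUN_SPREADSHEET_CHECKS = datetime.date.today().weekday() == 0  # Monday is 0

def check_against_excel(python_value, excel_value, label):
    """Compare a Python result with the spreadsheet value, but only on Mondays."""
    if not RUN_SPREADSHEET_CHECKS:
        return
    assert python_value == excel_value, f"{label}: {python_value!r} != {excel_value!r}"

# Called at each processing step; the values here are just illustrative.
check_against_excel(42, 42, "step 1 totals")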
Without more detail I do not see how we could offer any other options.
If these are tests in the "make sure it works" sense, they should not be in the production code. They should be wholly separate in a test suite.
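For instance, if the 20+ comparisons can be moved into their own module, a plain unittest suite can skip them wholesale on non-Mondays. This is only a sketch, and the class and test names are placeholders:

import datetime
import unittest

IS_MONDAY = datetime.date.today().weekday() == 0

@unittest.skipUnless(IS_MONDAY, "spreadsheet comparisons only run on Mondays")
class SpreadsheetComparisonTests(unittest.TestCase):
    """All 20+ spreadsheet comparisons live here, outside the production code."""

    def test_step_one_matches_excel(self):
        # Placeholder assertion; a real test would load the intermediate
        # results produced by the pipeline and the matching spreadsheet cells.
        self.assertEqual(1 + 1, 2)

if __name__ == "__main__":
    unittest.main()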
Testing code is a very broad topic, but here are a few resources to get you started.
Writing unit tests in Python, where do I start?
The Python unittest library.
Improve Your Python: Understanding Unit Testing
Python Software Development and Software Testing (posts and podcast)
I don't think I can separate the tests from the source code, since later tests take inputs from previous processing steps.
You absolutely can, every system does, but it may require redesigning your system. This is a common chicken-and-egg problem for legacy code: how do you change it safely if you can't test it? And there are various techniques for dealing with that. Refactoring, the process of redesigning code without changing how it works, will feature prominently. But without details I can't say much more.
1) checking that the data I got from the production database is the same as in the Excel file
2) checking that the Python code produces the same results as Excel (the logic is the same) when the inputs are the same
Rather than testing inside your code, you should be testing its outputs.
Both of these should be a matter of converting the output of the various processes into a common format which can then be compared. This could be dumping them as JSON, turning them all into Python data structures, CSVs... whatever is easiest for your data. Then compare them to ensure they're the same.
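As a rough sketch, assuming the spreadsheet side can be exported to CSV (the file layout and function names here are invented for illustration):

import csv

def rows_from_csv(path):
    """Load a CSV export (for example one saved from Excel) as a list of dicts."""
    with open(path, newline="") as handle:
        return list(csv.DictReader(handle))

def normalise(rows):
    """Coerce every value to a string so Python rows and CSV rows compare cleanly."""
    return [{key: str(value) for key, value in row.items()} for row in rows]

def assert_same_output(python_rows, excel_csv_path):
    """Fail loudly if the Python pipeline and the spreadsheet disagree."""
    excel_rows = rows_from_csv(excel_csv_path)
    assert normalise(python_rows) == normalise(excel_rows), "Python and Excel outputs differ"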
Again, without more detail about your situation I can't offer much more.
Related
I am in charge of testing a web application using Selenium WebDriver with Python. Over the past year I created a large script (20K+ lines) where each test is a separate function. Now my boss wants me to document my tests, explaining in plain English what each test does. What tool would you recommend to document the steps your tests make?
I think this is a great question. Many people and companies don't bother managing their existing tests properly, which leads to redundant and repeated code without a clear idea of what is covered by automated tests.
There is no single answer to this question but in general you can consider the following options:
Testing framework built-in reporting. In Java, for example, you have unit testing libraries like JUnit and TestNG. When they run, they generate output that can later be formatted and reviewed as the need arises. I am sure there is an equivalent unit testing framework in Python too.
You can also consider using a BDD tool like Cucumber. This is a bit different and might not be suitable when the tests are low-level system checks. It can, however, help you organize your test scenarios and keep them in a readable form. It is also very good for reporting to a non-technical person.
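One lightweight option, if each test function already carries a docstring, is to generate the plain-English list straight from those docstrings. A sketch (the module name in the commented-out call is hypothetical):

import importlib
import inspect

def describe_tests(module_name):
    """Print each test function's name and the first line of its docstring."""
    module = importlib.import_module(module_name)
    for name, func in inspect.getmembers(module, inspect.isfunction):
        if name.startswith("test_"):
            summary = (inspect.getdoc(func) or "(no description)").splitlines()[0]
            print(f"{name}: {summary}")

# describe_tests("tests.test_login")  # hypothetical test module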
I'm teaching myself backend and frontend web development (I'm using Flask, if it matters) and I need a few pointers when it comes to unit testing my app.
I am mostly concerned with these different cases:
The internal consistency of the data: that's the easy one. I'm aiming for 100% coverage when it comes to issues like the login procedure and, more generally, checking that everything that happens between the Python code and the database after every request remains consistent.
The JSON responses: what I'm doing at the moment is performing a test request for every GET/POST call in my app and then asserting that the JSON response must be this-and-that, but honestly I don't quite see the value in doing this, maybe because my app is still at an early stage?
Should I keep testing every json response for every request?
If yes, what are the long-term benefits?
External APIs: I read conflicting opinions here. Say I'm using an external API to translate some text:
Should I test only the very high level API, i.e. see if I get the access token and that's it?
Should I test that the returned json is what I expect?
Should I test nothing, to speed up my test suite and not make it dependent on a third-party API?
The outputted HTML: I'm lost on this one as well. Say I'm testing the function add_post():
Should I test that on the page that follows the request the desired post is actually there?
I started checking for the presence of strings/HTML tags in the raw response.data, but then I kind of gave up because 1) it takes a lot of time and 2) I would have to constantly rewrite the tests since I'm changing the app so often.
What is the recommended approach in this case?
Thank you and sorry for the verbosity. I hope I made myself clear!
Most of this is personal opinion and will vary from developer to developer.
There are a ton of python libraries for unit testing - that's a decision best left to you as the developer of the project to find one that fits best with your tool set / build process.
This isn't exactly 'unit testing' per se; I'd consider it more like integration testing. That's not to say it isn't valuable, it's just a different task and will often use different tools. For something like this, testing will pay off in the long run because you'll have peace of mind that your bug fixes and feature additions aren't breaking your end-to-end behaviour. If you're already doing it, I would continue. These sorts of tests are highly valuable when refactoring down the road to ensure consistent functionality.
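A sketch of one way to keep those JSON checks, asserting on the response shape rather than exact content, using Flask's test client (the application factory, endpoint, and field names are placeholders):

import unittest

from myapp import create_app  # hypothetical application factory

class JsonApiTests(unittest.TestCase):
    def setUp(self):
        self.client = create_app(testing=True).test_client()

    def test_get_posts_returns_expected_shape(self):
        response = self.client.get("/api/posts")
        self.assertEqual(response.status_code, 200)
        payload = response.get_json()
        # Assert on the shape rather than every value, so the test survives
        # content changes but still catches a broken endpoint.
        self.assertIsInstance(payload, list)
        for post in payload:
            self.assertIn("id", post)
            self.assertIn("title", post)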
I would not waste time testing third-party APIs. It's their job to make sure their product behaves reliably, and you'll be there all day if you start testing third-party features. A big reason to use a third-party API is so you don't have to test it. If you ever discover that your app is breaking because of a third-party API, it's probably time to pick a different API. If your project scales to a size where you're losing thousands of dollars every time that API fails, you have a whole new ball of issues to deal with (and hopefully the resources to address them) at that point.
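If you still want your own code paths around the external service covered without ever calling it, stubbing the call is the usual compromise. This is only a sketch; the wrapper module and function names are invented:

import unittest
from unittest import mock

import myapp.translation  # hypothetical wrapper module around the external API

class TranslationTests(unittest.TestCase):
    def test_translate_uses_service_result(self):
        # Stub the network call so the suite stays fast and does not depend
        # on the third-party service being up.
        with mock.patch.object(myapp.translation, "call_translation_api",
                               return_value={"text": "hola"}) as fake_call:
            result = myapp.translation.translate("hello", target="es")
        fake_call.assert_called_once()
        self.assertEqual(result, "hola")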
In general, I don't test static content or HTML. There are web scraping tools out there that will let you crawl your own website to check for consistent functionality. I would personally leave this as a last priority for the final stages of refinement, if you have time. The look and feel of most websites changes so often that writing tests for it isn't worth it. Look and feel is also really easy to test manually because it's so visual.
I recently stumbled upon this (aged) article:
http://imranontech.com/2007/01/04/unit-testing-the-final-frontier-legacy-code/
where the author allegedly wrote a Perl script to automatically generate test cases.
His strategy went like this (quoted):
1. Read in the header files I gave it.
2. Extracted the function prototypes.
3. Gave me the list of functions it found and let me pick which ones I wanted to create unit tests for.
4. It then created a dbx (Solaris debugger) script which would break-point every time the selected function was called, save the variables that were passed to it, and then continue until the function returned, at which point it would save the return value.
5. Run the executable under the dbx script, at which point I proceeded to use the application as normal, and just ran through lots of use cases which I thought would go through the code in question, and especially cases where I thought it would hit edge cases in the functions I want to create unit tests for.
6. The Perl script then took all of the example runs, stripped out duplicates, and then autogenerated a C file containing unit tests for each of the examples (i.e. pass in the input data and verify the return value is the same as in the example run).
7. Compiled/linked/ran the unit tests and threw away ones which failed (i.e. got rid of inputs which cause the function to behave non-deterministically).
I have a lot of legacy code of all kinds in the languages Python and Fortran. The article is from 2007. Is there anything like this implemented in current Unit testing frameworks?
How would I go about writing such a script?
Very C-like. Also OS-dependent, I think (Solaris debugger)? I'd say you should look at "record/capture and playback" tools, though somehow I think the "generate" part never really took off.
Python's testing tools taxonomy would be a great place to start. I'd say you record your way through the application using either Selenium or Dogtail. The link takes you right to that section, web testing tools, but check the others as well: fuzz testing is a technique similar to Golden Master, which sometimes may help with legacy apps and is a "record / playback" technique. Feathers calls such tests "characterization" tests, for they characterize a legacy system's behaviours.
There is a very good point in the article you cite:
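A characterization (Golden Master) test can be sketched in a few lines of Python: record what the legacy code does today, then flag any deviation later. The file name and the stand-in function below are just for illustration:

import json
import pathlib

GOLDEN = pathlib.Path("golden_output.json")  # recorded "known good" output

def characterize(func, inputs):
    """Record the legacy function's current behaviour, then guard against change."""
    outputs = [func(value) for value in inputs]
    if not GOLDEN.exists():
        GOLDEN.write_text(json.dumps(outputs))  # first run: record the golden master
        return
    assert outputs == json.loads(GOLDEN.read_text()), "legacy behaviour changed"

# Example usage with a trivial stand-in for a legacy function:
characterize(lambda x: x * 2, [0, 1, 2, 3])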
Have a look at your own source code repository and see which functions/classes have had the most bugfix checkins applied; 80% of bugfixes tend to be made to about 20% of the code. There's sound logic behind this – often that 20% of the code is poorly written with dozens or hundreds of "special case" hacks.
This is where I'd actually start. Have you got these parts identified? Simple Git/SVN log usage scripts and the coverage tools section from the taxonomy would come in handy for this.
Unfortunately, more than that I can't help you; my Python experience is limited and my Fortran experience is non-existent.
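As a starting point, a small churn report can be built from the Git log; this sketch just counts how often each file appears in the history:

import collections
import subprocess

# "--pretty=format:" suppresses the commit headers, leaving only file paths.
log = subprocess.run(
    ["git", "log", "--name-only", "--pretty=format:"],
    capture_output=True, text=True, check=True,
).stdout

churn = collections.Counter(line for line in log.splitlines() if line.strip())
for path, count in churn.most_common(20):
    print(f"{count:5d}  {path}")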
I've been working on a fairly large Python project with a number of tests.
Some specific parts of the application require some CPU-intensive testing, and our approach of testing everything before commit stopped making sense.
We've adopted a tag-based selective testing approach since. The problem is that, as the codebase grows, maintaining said tagging scheme becomes somewhat cumbersome, and I'd like to start studying whether we could build something smarter.
In a previous job the test system was such that it only tested code that was affected by the changes in the commit.
It seems like Mighty Moose employs a similar approach for CLR languages. Using these as inspiration, my question is, what alternatives are there (if any) for smart selective testing in Python projects?
In case there aren't any, what would be good initial approaches for building something like that?
The idea of automating the selective testing of parts of your application definitely sounds interesting. However, it feels like this is something that would be much easier to achieve with a statically typed language, but given the dynamic nature of Python it would probably be a serious time investment to get something that can reliably detect all tests affected by a given commit.
When reading your problem, and putting aside the idea of selective testing, the approach that springs to mind is being able to group tests so that you can execute test suites in isolation, enabling a number of useful automated test execution strategies that can shorten the feedback loop such as:
Parallel execution of separate test suites on different machines
Running tests at different stages of the build pipeline
Running some tests on each commit and others on nightly builds.
Therefore, I think your approach of using tags to partition tests into different 'groups' is a smart one, though as you say the management of these becomes difficult with a large test suite. Given this, it may be worth focusing time on building tools to aid in the management of your test suite, particularly the management of your tags. Such a system could be built by gathering information from:
Test result output (pass/fail, execution time, logged output)
Code coverage output
Source code analysis
Good luck, it's definitely an interesting problem you are trying to solve, and I hope some of these ideas help you.
I guess you are looking for a continuous testing tool?
I created a tool that sits in the background and runs only the impacted tests (you will need the PyCharm plugin and pycrunch-engine from pip):
https://github.com/gleb-sevruk/pycrunch-engine
This will be particularly useful if you are using PyCharm.
More details are in this answer:
https://stackoverflow.com/a/58136374/2377370
If you are using unittest.TestCase then you can specify which files to execute with the pattern parameter of test discovery. Then you can execute tests based on the code changed. Even if you are not using unittest, you should have your tests organised by functional area/module so that you can use a similar approach.
Optionally, though it is not an elegant solution to your problem: if each developer/group or functional code area committed to a separate branch, you could have the tests executed in your continuous testing environment. Once that's completed (and passed), you can merge the branch into your main trunk/master branch.
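A sketch of that with unittest's test discovery; the directory and pattern are illustrative, with "reports" standing in for whatever area a commit touched:

import unittest

# Discover only the test files whose names match the area that changed.
loader = unittest.TestLoader()
suite = loader.discover(start_dir="tests", pattern="test_reports*.py")
unittest.TextTestRunner(verbosity=2).run(suite)

The command-line equivalent is python -m unittest discover -s tests -p "test_reports*.py".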
A combination of nightly jobs of all tests and per-branch tests every 15-30 minutes (if there are new commits) should suffice.
A few random thoughts on this subject, based on work I did previously on a Perl codebase with similar "full build is too long" problems:
Knowing your dependencies is key to making this work. If module A is dependent on B and C, then you need to test A when either of them is changed. It looks like Snakefood is a good way to get a dictionary that outlines the dependencies in your code; if you take that and translate it into a makefile, then you can simply run "make test" on check-in and all of the dependencies (and only the needed ones) will be rebuilt and tested.
Once you have a makefile, work on making it parallel; if you can run a half-dozen tests in parallel, you'll greatly decrease running time.
If you write the test results to a file, you can then use make or a similar tool to determine when it needs to "rebuild" the tests: make can compare the timestamp of the test results with that of the dependent Python files.
Unfortunately Python isn't too good at determining what it depends on, because modules can be imported dynamically, so you can't reliably look at imports to determine affected modules.
I would use a naming convention to allow make to solve this generically. A naive example would be:
%.test_result : %_test.py
	python $< > $@
This defines a new implicit rule to convert between _test.py files and test results.
Then you can tell make about the additional dependencies for your tests, something like this:
my_module_test.py : module1.py module2.py external\module1.py
Consider turning the question around: what tests need to be excluded to make running the rest tolerable? The CPython test suite in Lib/test excludes resource-heavy tests until they are specifically requested (as they may be on a buildbot). Some of the optional resources are 'cpu' (time), 'largefile' (disk space), and 'network' (connections). (python -m test -h on 3.x, or test.regrtest on 2.x, gives the whole list.)
Unfortunately, I cannot tell you how to do so as 'skip if resource is not available' is a feature of the older test.regrtest runner that the test suite uses. There is an issue on the tracker to add resources to unittest.
What might work in the meantime is something like this: add a machine-specific file, exclusions.py, containing a list of strings like those above. Then import exclusions and skip tests, cases, or modules if the appropriate string is in the list.
We've run into this problem a number of times in the past and have been able to address it by improving and refactoring the tests. You don't specify your development practices or how long it takes to run your tests. I would say that if you are doing TDD, your tests need to run in no more than a few seconds. Anything that runs longer than that should move to a server. If your tests take longer than a day to run, then you have a real issue, and it will limit your ability to deliver functionality quickly and effectively.
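A sketch of that idea; the EXCLUDED list, the decorator name, and the sample test are placeholders:

# exclusions.py (machine-specific, not committed) might contain:
#   EXCLUDED = ["cpu", "largefile", "network"]

import unittest

try:
    from exclusions import EXCLUDED
except ImportError:
    EXCLUDED = []

def skip_if_excluded(resource):
    """Skip a test, case, or module when its resource is listed in exclusions.py."""
    return unittest.skipIf(resource in EXCLUDED, f"resource {resource!r} excluded on this machine")

class HeavyTests(unittest.TestCase):
    @skip_if_excluded("cpu")
    def test_expensive_simulation(self):
        self.assertTrue(True)  # placeholder for a CPU-heavy check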
Couldn't you use something like Fabric? http://docs.fabfile.org/en/1.7/
I am developing some Python modules that use a MySQL database to insert some data and produce various types of report. I'm doing test-driven development, and so far I run:
some CREATE / UPDATE / DELETE tests against a temporary database that is thrown away at the end of each test case, and
some report generation tests doing exclusively read only operations, mainly SELECT, against a copy of the production database, written on the (valid, in this case) assumption that some things in my database aren't going to change.
Some of the SELECT operations are running slowly, so my tests are taking more than 30 seconds, which spoils the flow of test-driven development. I can see two choices:
only put a small fraction of my data into the copy of the production database that I use for testing the report generation, so that the tests go fast enough for test-driven development (less than about 3 seconds suits me best), or regard the tests as failures; I'd then need to do separate performance testing.
fill the production database copy with as much data as the main test database, and add timing code that fails a test if it is taking too long.
I'm not sure which approach to take. Any advice?
I'd do both. Run against the small set first to make sure all the code works, then run against the large dataset for things which need to be tested for time; that would be selects, searches, and reports especially. If you are doing inserts, deletes, or updates on multiple row sets, I'd test those against the large set as well. It is unlikely that simple single-row action queries will take too long, but if they involve a lot of joins, I'd test them too. If the queries won't run on prod within the timeout limits, that's a failure, and it is far, far better to know as soon as possible so you can fix it before you bring prod to its knees.
The problem with testing against real data is that it contains lots of duplicate values, and not enough edge cases. It is also difficult to know what the expected values ought to be (especially if your live database is very big). Oh, and depending on what the live application does, it can be illegal to use the data for the purposes of testing or development.
Generally the best thing is to write the test data to go with the tests. This is laborious and boring, which is why so many TDD practitioners abhor databases. But if you do have a live data set (which you are allowed to use for testing), then take a very cut-down subset of the data for your tests. If you can write valid assertions against a dataset of thirty records, running your tests against a dataset of thirty thousand is just a waste of time.
But definitely, once you have the queries returning the correct results, put them through some performance tests.
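For the timing side, a simple budgeted test is enough to start with; run_monthly_report and the 3-second budget below are placeholders for your own query and threshold:

import time
import unittest

from reports import run_monthly_report  # hypothetical report query helper

class ReportPerformanceTests(unittest.TestCase):
    TIME_BUDGET_SECONDS = 3.0

    def test_monthly_report_within_budget(self):
        start = time.monotonic()
        run_monthly_report()
        elapsed = time.monotonic() - start
        self.assertLess(elapsed, self.TIME_BUDGET_SECONDS,
                        f"report took {elapsed:.1f}s, budget is {self.TIME_BUDGET_SECONDS}s")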