Automated testing topics: BIRT reporting for quality information over time
19.01.2012 15:38 by Alexandra Schladebeck
Background
Back in early 2011, when we were deciding where and how to split GUIdancer into two tools, we put a lot of consideration into what we wanted to achieve with our Open Source activities. We firmly decided that Jubula users should have all they need to write, execute and analyze tests. We wanted to have a good, well-rounded tool as our Open Source contribution, not something that is missing necessary features. Nevertheless, we did want to save some nice features for GUIdancer, the idea being that for a small amount of money, you can add some nice aspects to your testing project and process.
As it turns out, many of the features we kept closed source and as part of the commercial tool are things that are likely to become nice-to-haves once you’ve got more than just a few tests up and running. Once tests start getting bigger, then aspects like Test Style (like Checkstyle) and Mylyn integration really start getting interesting. And if you’re using Jubula in CI processes and your tests are gaining in importance, then it would be nice to have reporting capabilities (you know, for the managers ;) ) and to get some information on code coverage. If you’ve already used Jubula successfully in one project, then maybe you’ll think about starting testing even earlier next time with the Model-Based approach.
But these are things that come later, and we often get asked the question by newer users – what can I do with all of this? Well, in a short blog series, we’re going to describe how the various aspects of GUIdancer are designed to be used, and how they help us with our work in the Jubula project and in customer projects. This first entry will look at how we use BIRT reports in our daily, weekly and monthly work.
GUIdancer and BIRT – the basics
Jubula contains various options to analyze single test runs – a test result report appears dynamically during test execution, and individual test reports can be reopened including information on error-types and screenshots. This is indispensable in a test tool – I need to be able to see what went wrong so it can be fixed. However, what this doesn’t allow me is any kind of view over time. Questions such as:
- How many tests ran successfully / failed / didn’t run this week / month?
- How many tests have been added over the last period of time?
- How has the code coverage developed since adding the new tests?
can’t be answered with single test runs; the results of many runs need to be aggregated and displayed in an easily readable manner. This is the aim of the BIRT integration in GUIdancer.
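To make the idea of a view over time concrete, here is a minimal sketch of that kind of aggregation. It is only an illustration and not how GUIdancer or BIRT work internally: the record layout, the status names and the `summarize_by_week` helper are all assumptions made up for this example.

```python
from collections import Counter
from datetime import date

# Hypothetical records: (run date, status). This is NOT GUIdancer's data model.
runs = [
    (date(2012, 1, 9), "passed"),
    (date(2012, 1, 10), "failed"),
    (date(2012, 1, 11), "passed"),
    (date(2012, 1, 16), "not_run"),
    (date(2012, 1, 17), "passed"),
]


def summarize_by_week(runs):
    """Count run statuses per ISO (year, week) pair."""
    summary = {}
    for day, status in runs:
        year, week, _ = day.isocalendar()
        summary.setdefault((year, week), Counter())[status] += 1
    return summary


weekly = summarize_by_week(runs)
for (year, week), counts in sorted(weekly.items()):
    print(year, week, dict(counts))
```

A single run answers "what went wrong last night?"; the grouped counts answer "how did this week compare with last week?".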
The existing BIRT reports
When we added the BIRT feature, we added some example reports that we imagined we would like to use and that may be useful for customers. New reports can be added by anyone who wants to, but it is nice to have a selection out of the box. Of the reports that we offer, there are three that we use very extensively. They have become such an integral part of our project that it’s worth describing how we use them and what they tell us.
Report 1: Comments
The report we use the most frequently is the “Comments” report. The first activity our team has every morning is to analyze the test results. As we have in excess of 70 GUIdancer / Jubula Test Suites (functional tests, performance tests, tests for our actions, tests for starting AUTs – all multiplied by a total of 5 platforms!), we wanted something that would help us to report the test results at our stand-up meeting. For any Test Suite that failed overnight, we use the “Edit Comment” function in the Reporting Perspective to add a short description of the reason for the failure to the test run. This also gives us a comparison over time and lets us find patterns in even sporadically occurring errors.
Once the comments have been entered, the “Comments” report is generated. The report displays a table of all failed Test Suites in a specified period of time (we choose “yesterday”) that don’t have the word “BROKEN” in their name (we have specific “BROKEN” Test Suites containing tests that show known, uncritical bugs). For each Test Suite, the comment is shown in the table. The report is printed and brought to the stand-up meeting so that any necessary action can be discussed.
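The selection logic described above (failed, inside the chosen period, no “BROKEN” in the name) can be sketched in a few lines. This is an illustration under assumptions, not the actual report definition: the `SuiteRun` record, the `comments_report` function and the sample suite names are hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class SuiteRun:
    """Hypothetical record of one Test Suite run (not the real schema)."""
    name: str
    run_date: date
    passed: bool
    comment: str = ""


def comments_report(runs, start, end):
    """Return (suite name, comment) rows for failed runs in [start, end],
    skipping suites whose name contains "BROKEN"."""
    rows = [
        (r.name, r.comment)
        for r in runs
        if not r.passed
        and start <= r.run_date <= end
        and "BROKEN" not in r.name
    ]
    return sorted(rows)


runs = [
    SuiteRun("Functional_Win", date(2012, 1, 18), False, "AUT did not start"),
    SuiteRun("BROKEN_Known_Bugs", date(2012, 1, 18), False, "known bug"),
    SuiteRun("Performance_Linux", date(2012, 1, 18), True),
    SuiteRun("Actions_Mac", date(2012, 1, 17), False, "database unreachable"),
]

yesterday = date(2012, 1, 18)
rows = comments_report(runs, yesterday, yesterday)
for name, comment in rows:
    print(f"{name}: {comment}")
```

With this sample data, only the failed, non-BROKEN suite from the chosen day ends up in the table.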
Report 2: History and Coverage
The second report that we use on a regular basis shows the number of expected Test Steps versus the number of executed Test Steps, as well as the Code Coverage value, as three points for each test run. Again, by entering a time frame (we usually take 2 weeks for our weekly “Show&Tell” meeting, but sometimes look over a whole year for a better comparison), we can see how the tests are progressing. The report below shows a period of two months during which new tests were frequently added (the red line), the code coverage improved (the green line), and, after an initial period of stability, tests often failed (the blue line). The pattern did improve on the day the report was printed, however (the last point on the graph).
This report is one of my favorite ones, because it can potentially show us so much. On the one hand, the number of expected Test Steps can show how many tests have been added recently, and what sort of an effect this has had on the Code Coverage (I’m looking forward to writing the blog entry on Code Coverage, because it is fascinating to try and decide whether it actually tells us anything, but that’s another story…).
On the other hand, the number of executed Test Steps can give us some hints about the type of errors we’ve been having:
- Drops to almost 0 can either mean that something went terribly wrong at a central place (no database connection could be established, for example), or that the error handling is bad (a small error resulted in the whole test stopping).
- A zig-zag graph is usually a good sign of test environment problems: sometimes the tests run, sometimes they don't. Test environment problems are particularly nasty: nobody really wants to feel responsible for them, yet they can be just as damaging as software errors or errors in the test specification.
- A flat graph means that nothing has been changed. No new tests, possibly no new versions of the software, and no new fixes. It's important to analyze why. In this example, it was simply because the tester was on holiday and there was no cover for him!
- Small dips in the executed Test Steps can either mean relatively uncritical errors, or good error handling.
- Reports that never manage to get to 100% execution also tell us something about our reaction times. The status quo should always be that we are executing 100% of our expected tests – otherwise we have gaps in our knowledge. This means reacting quickly in the team, and it also means that our tests have to be able to keep up with the pace (I wrote this entry on test discipline a while ago). Errors will happen, so there should never be any questions about why 100% isn’t achieved constantly. But we can get a feeling for our general stability if we’re seeing too many drops that take too long to fix.
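The patterns in the list above are read off the graph by eye, but they can also be expressed as simple rules over the ratio of executed to expected Test Steps. The following is only a rough sketch: the function name `describe_history` and all thresholds (`near_zero`, `small_dip`, `min_flips`) are assumptions chosen for illustration, not values taken from GUIdancer or from our reports.

```python
def describe_history(expected, executed, near_zero=0.05, small_dip=0.9, min_flips=3):
    """Return rough hints about a series of test runs.

    expected / executed: Test Step counts per run, oldest first.
    All thresholds are illustrative assumptions.
    """
    ratios = [x / e if e else 0.0 for e, x in zip(expected, executed)]
    deltas = [b - a for a, b in zip(ratios, ratios[1:])]
    # A "flip" is a change of direction between two consecutive movements.
    flips = sum(1 for a, b in zip(deltas, deltas[1:]) if a * b < 0)

    hints = []
    if any(r <= near_zero for r in ratios):
        hints.append("drop to almost 0")
    if flips >= min_flips:
        hints.append("zig-zag")
    if len(set(expected)) == 1 and len(set(executed)) == 1:
        hints.append("flat")
    if any(small_dip <= r < 1.0 for r in ratios):
        hints.append("small dip")
    if ratios and max(ratios) < 1.0:
        hints.append("never reaches 100%")
    return hints


print(describe_history([100] * 5, [100, 2, 100, 100, 100]))
print(describe_history([100] * 6, [100, 50, 100, 50, 100, 50]))
print(describe_history([100] * 4, [95, 96, 97, 95]))
```

The hints only say what the curve looks like. Deciding whether a drop was a central failure or bad error handling still needs someone to look at the individual test results.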
Report 3: Histogram
This report gets used to publish our test results on the Jubula website. It currently has a couple of advantages over the History report – it can show days when tests didn’t run at all (broken build, environment problems), and it also splits the Test Steps into successful / failed / not-executed. Otherwise, it shows a pretty similar view of the world to that of the History except that the colors are nicer :).
This first diagram shows tests that are generally stable, but no new steps are being added.
This diagram, on the other hand, shows a test for a problematic area of the software that our team decided to make a priority. We wanted to add more tests and fix any issues they found quickly. This shows the last year. We've steadily been adding more and more tests, making them work by fixing bugs, then adding more tests. As you can see, at the time of writing, we're entering a stage of fixing ;)
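The two things that set the Histogram apart, the split into successful / failed / not-executed Test Steps and the visible gaps on days without a run, can be sketched as follows. The data layout and the `daily_histogram` helper are hypothetical and only meant to illustrate the idea, not the report's real query.

```python
from datetime import date, timedelta


def daily_histogram(results, start, end):
    """Return one row per calendar day in [start, end].

    results maps a date to (successful, failed, not_executed) Test Step
    counts. Days without any run are kept and marked with None, so that
    gaps (broken build, environment problems) stay visible.
    """
    rows = []
    day = start
    while day <= end:
        rows.append((day, results.get(day)))
        day += timedelta(days=1)
    return rows


# Hypothetical sample data with no run on 17 January.
results = {
    date(2012, 1, 16): (120, 3, 7),
    date(2012, 1, 18): (125, 0, 5),
}

rows = daily_histogram(results, date(2012, 1, 16), date(2012, 1, 18))
for day, counts in rows:
    if counts is None:
        print(day, "no run")
    else:
        ok, failed, skipped = counts
        print(day, f"successful={ok} failed={failed} not-executed={skipped}")
```

Iterating over the calendar rather than over the runs is what makes the missing day show up at all.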
Other reports
We tend to add new reports as we realize we need them to support our process. The possibilities are endless – percentage of static waits in the tests and test duration are just two that come to mind. At the moment though, we feel quite at home with the three I’ve described. What we’d like to hear is how you use reports in GUIdancer, if you do already. Contact us with your examples and receive a goodie to say thank you!
Until next time
The next entry in this series will be about Code Coverage, so stay tuned. In the meantime, happy testing!













