Insights into Automated Integration Testing
Thorough testing is one of the cornerstones of building robust and reliable software. Here's a glimpse at one of the testing strategies Avco employs.
At Avco, we make use of several testing methodologies. Among these are:
- Unit testing - Each part is tested individually, and must meet certain metrics (e.g. coverage and smoke tests), before it is integrated into the main project repository. Most modern languages have either built-in or de facto standard libraries for unit testing.
- Manual system testing - QA engineers play the role of the end user by executing a written test plan. The testers will test the application’s features to ensure correct behavior through a set of vital use cases.
- Exploratory testing - A developer or user manually exercises new functionality, for example as part of reviewing a merge request.
- Automated UI testing - Interacting with the system via the frontend UI in an automated manner. For example, for testing web applications we often use tools such as Selenium.
- Automated integration testing - At Avco this means having a repeatable process that drives the system. The process checks that normal processes, or something very close, work as expected through a set of test cases. The intent is to both encode acceptance criteria and avoid regressions in the overall system.
Each of these has its place, and not all projects necessitate all of them. For example, a simple database-backed web site may not warrant Selenium or automated integration testing. A backend processing system for business transactions may not be suitable for manual testing or UI testing, but full integration testing can provide a great deal of value.
Automated Integration Testing
In this post we are going to explore automated integration testing, how it affects the design of the system under test, and what to consider when implementing it.
The goals are primarily to reduce the effort required to test and to increase the repeatability of test plan executions. Constructing such a system costs time and money, but this is offset by the benefits realised over time: fewer bugs and easier subsequent maintenance and development.
As with any approach, it is not necessarily the best for all situations. Some systems may warrant considering another approach:
- Small, non-interactive systems - So long as you have a good level of unit testing this is unlikely to add significant value.
- Websites - If you want end-to-end tests for a web application you might be better off with a WebDriver approach. It is also possible to integrate such technologies into a test system like the one described here, but the amount of benefit derived will vary depending on the project.
- GUI applications - You are likely to be better off with a GUI automation suite to drive the application, or with making more of the logic accessible to unit tests. An old application with a lot of business logic that cannot be decomposed would probably be an exception.
The example in this post was built for the A2C data standards system. By way of background, our A2C data standards service handles A2C messages for awarding organisations (previously known as exam boards or awarding bodies), acting as a proxy between the management information systems for centres (schools or colleges) and the awarding organisation systems. Centres submit orders in A2C format and get feedback on each individual transaction, telling them what was wrong or needs correcting depending on the type of product (qualification/exam) ordered and the learner in question. On the awarding organisation side, we expose a set of web services that hook into custom integrations. These can be file drops, push or pull web service calls, or a mixture of multiple methods. In between, the service validates messages according to the A2C specification and the configured product catalogue: the list of qualifications and business rules provided by the awarding organisation.
Of course, we had unit tests for individual processing steps, but we also needed repeatable validation that the system was working and that assumptions continued to be valid in the face of future changes. During development we were faced with multiple moving targets: the A2C specification, the data model, and customer needs. We needed to ensure the ongoing operation of the system despite these complications.
How did we go about achieving this? A2C is a message processing system. Messages from centres are processed in order, including the backend processing, then feedback is returned asynchronously. Once one message from a centre is deemed complete, the next becomes eligible for processing.
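To make the ordering rule concrete, here is a minimal sketch of per-centre sequencing. It is not the A2C implementation; the class name `CentreQueues` and its methods are hypothetical, and messages are plain strings for illustration only.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, Optional


class CentreQueues:
    """Hypothetical model: one in-flight message per centre, in arrival order."""

    def __init__(self) -> None:
        self._pending: Dict[str, Deque[str]] = defaultdict(deque)
        self._in_flight: Dict[str, str] = {}

    def submit(self, centre: str, message: str) -> None:
        # Messages queue up behind any earlier message from the same centre.
        self._pending[centre].append(message)

    def next_eligible(self, centre: str) -> Optional[str]:
        # Nothing is eligible while an earlier message is still being processed.
        if centre in self._in_flight or not self._pending[centre]:
            return None
        message = self._pending[centre].popleft()
        self._in_flight[centre] = message
        return message

    def complete(self, centre: str) -> None:
        # Marking the current message complete unblocks the next one.
        self._in_flight.pop(centre, None)
```

Because feedback only arrives once this chain of processing finishes, a test client cannot assume a response is available immediately after sending.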
A client was constructed to send the input for the test cases over the transport protocol, wait a while and poll for responses. The contents were verified against the expected feedback once these were received.
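The send, wait and poll cycle can be sketched roughly as follows. This is an illustration under assumptions rather than our actual client: `poll_for_feedback` and its parameters are hypothetical names, and `fetch` stands in for whatever call retrieves new feedback over the transport protocol.

```python
import time
from typing import Callable, List


def poll_for_feedback(
    fetch: Callable[[], List[str]],
    expected_count: int,
    timeout_s: float = 30.0,
    interval_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> List[str]:
    """Collect feedback until expected_count items arrive or the timeout expires."""
    deadline = clock() + timeout_s
    received: List[str] = []
    while True:
        # `fetch` is assumed to return only feedback not seen before.
        received.extend(fetch())
        if len(received) >= expected_count:
            return received
        if clock() >= deadline:
            raise TimeoutError(
                f"expected {expected_count} feedback items, got {len(received)}"
            )
        sleep(interval_s)
```

Passing `sleep` and `clock` in as parameters is a design choice that lets the polling logic itself be unit tested without real delays.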
We used real A2C messages and feedback to define the test cases. Tooling was constructed to verify the equivalence of two sets of feedback messages. This was necessary due to the order and grouping of messages returned being an implementation detail rather than prescribed functionality (it was as intended, but we did not want to include it in these test cases).
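One way to compare feedback while ignoring order and grouping is to flatten each set and compare the results as multisets. The sketch below assumes each feedback item can be represented as a flat dictionary of strings; the real A2C feedback format is richer, and `feedback_equivalent` is a hypothetical name rather than our actual tooling.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple

Feedback = Dict[str, str]  # assumed simplification of a feedback message


def _canonical(item: Feedback) -> Tuple[Tuple[str, str], ...]:
    # A sorted tuple of key/value pairs is hashable and ignores key order.
    return tuple(sorted(item.items()))


def feedback_equivalent(
    expected: Iterable[List[Feedback]], actual: Iterable[List[Feedback]]
) -> bool:
    """True if both sides contain the same items, regardless of order or grouping."""

    def flatten(groups: Iterable[List[Feedback]]) -> Counter:
        return Counter(_canonical(item) for group in groups for item in group)

    # Counter comparison keeps duplicates significant: two identical
    # feedback items are not the same as one.
    return flatten(expected) == flatten(actual)
```

For example, `[[a, b]]` and `[[b], [a]]` compare as equivalent, while a missing or duplicated item does not.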
On the back end we constructed an accept-all stub, or ‘yes box’, to simulate the awarding organisation system. This would accept all valid orders (as validated by the main system), with several caveats to assist in testing. For example, it would reject all learners born on a certain date, or orders for products with a code starting with ‘NO’.
Initially we used a product catalogue provided by one of the awarding organisations. However, it soon turned out that a lot of the data is time-sensitive. For example, products may only be available to learners between certain dates. These availability windows do not overlap for different types of orders either, which means using the current time for validation is not practical.
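The decision logic of such a stub can be very small. The following is a sketch only: the `Order` fields, the function name `yes_box` and the specific trigger date are assumptions made for illustration, while the ‘NO’ prefix rule comes from the description above.

```python
from dataclasses import dataclass
from datetime import date
from typing import Tuple

REJECTED_BIRTH_DATE = date(2000, 1, 1)  # hypothetical trigger date
REJECTED_PRODUCT_PREFIX = "NO"


@dataclass(frozen=True)
class Order:
    learner_id: str
    date_of_birth: date
    product_code: str


def yes_box(order: Order) -> Tuple[bool, str]:
    """Accept every order unless it matches one of the deliberate rejection triggers."""
    if order.date_of_birth == REJECTED_BIRTH_DATE:
        return False, "learner rejected by test trigger (date of birth)"
    if order.product_code.startswith(REJECTED_PRODUCT_PREFIX):
        return False, "product rejected by test trigger (code prefix)"
    return True, "accepted"
```

Keeping the triggers in the test data itself means a test case can request a rejection without any extra configuration of the stub.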
We ended up building our own product catalogue, just to exercise the system. It contained ranges of qualifications with dates extending 100 years each way. This meant the tests would continue to work for the foreseeable future and, probably, longer than the lifetime of the system.
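As a rough illustration of the idea, a test catalogue can be generated with availability windows wide enough that the current date never matters. The `Product` fields and `build_test_catalogue` are hypothetical; a real catalogue also carries business rules that are not modelled here.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, List


@dataclass(frozen=True)
class Product:
    code: str
    available_from: date
    available_to: date

    def is_available(self, on: date) -> bool:
        return self.available_from <= on <= self.available_to


def build_test_catalogue(
    codes: Iterable[str], today: date, span_years: int = 100
) -> List[Product]:
    """Give every product a window of span_years either side of today."""
    # Whole calendar years avoid edge cases such as 29 February.
    start = date(today.year - span_years, 1, 1)
    end = date(today.year + span_years, 12, 31)
    return [Product(code, start, end) for code in codes]
```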
To generate suitable test data that would come from centres, such as learner details and the associated identifiers, we reused test data we had produced for another project. This provided us with realistic-enough but completely fake personal information suitable for use in tests. Having realistic-looking data was also useful during manual and unit testing.
Together, these meant we had a solid foundation for continuing to iterate on the system while reducing the risk of regression. The main lessons to learn are:
- Make the input and output suitable for automation - In the case of A2C, the input was a web service call, and we replaced the backend processing with a stub awarding organisation, so with some work the whole exchange was repeatable.
- Provide a way to clear down ‘instance’ data - This is data that is local to a customer/organisation and may influence future transactions. For example, in A2C you were often not allowed to enter the same learner multiple times for the same exam in the same session.
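If no existing data set is to hand, fake learners can be generated deterministically. This is a sketch of that alternative, not the data set we used: the name lists, the identifier format and the date range are all invented for illustration.

```python
import random
from datetime import date, timedelta
from typing import Dict, List

# Invented sample values; any fixed lists would do.
FIRST_NAMES = ["Alex", "Sam", "Jordan", "Robin", "Casey"]
LAST_NAMES = ["Smith", "Jones", "Taylor", "Brown", "Evans"]


def fake_learners(count: int, seed: int = 0) -> List[Dict[str, str]]:
    """Generate fake learners; the same seed always yields the same data."""
    rng = random.Random(seed)  # local generator, so global state is untouched
    learners = []
    for i in range(count):
        dob = date(2005, 1, 1) + timedelta(days=rng.randrange(365 * 3))
        learners.append(
            {
                "learner_id": f"TEST{i:06d}",  # hypothetical identifier format
                "first_name": rng.choice(FIRST_NAMES),
                "last_name": rng.choice(LAST_NAMES),
                "date_of_birth": dob.isoformat(),
            }
        )
    return learners
```

Seeding the generator keeps the data stable between runs, so expected feedback recorded for a test case stays valid.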
- Make the tests repeatable - Clearing down data prior to running the tests made them repeatable. This can take the form of a fresh database for each test (which allows for inspecting the state of failed tests), or deleting data relating to a given test scenario.
- Use realistic data - Use something close to your exchange formats to run the end-to-end tests. This exercises the full system, including the outer edges, and makes it easier to take a found bug and convert it into a test case. Conversion can be done either by reproducing it against our test product catalogue or by adding features to the test catalogue to allow us to hit the issue.
- Other types of testing are a resource - You can use issues found during integration testing to point out unit tests that should exist but may have been missed. For example, an issue may turn out to violate an assumption of an internal API, in which case there should be unit tests on all such modules to verify they comply with the wider expectations of the system.
We have used this technique with success in many larger systems. While a bespoke test environment can be expensive in time and effort, it makes our system tests repeatable and practical to run. Many of the cases involved specific sequences of events that were difficult or error-prone to set up by hand, so a lot of effort had previously been spent both building the test cases and trying to verify the intended behaviour manually.
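The clear-down and repeatability lessons can be illustrated with a small sketch. It assumes a SQLite database and a hypothetical `entries` table in which a uniqueness constraint stands in for the "same learner, same exam, same session" rule; the real system's storage is not described here.

```python
import sqlite3


def fresh_database() -> sqlite3.Connection:
    """Create an empty in-memory database, one per test run."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE entries ("
        " learner_id TEXT, product_code TEXT, session TEXT,"
        " UNIQUE (learner_id, product_code, session))"
    )
    return conn


def submit_entry(
    conn: sqlite3.Connection, learner_id: str, product_code: str, session: str
) -> bool:
    """Return False if the same entry already exists."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO entries VALUES (?, ?, ?)",
                (learner_id, product_code, session),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def clear_down(conn: sqlite3.Connection, learner_id: str) -> None:
    """Delete instance data for one test scenario so it can be re-run."""
    with conn:
        conn.execute("DELETE FROM entries WHERE learner_id = ?", (learner_id,))
```

Without one of these two options, the second run of a test that submits the same entry would be rejected as a duplicate, and the test would fail for reasons unrelated to the code under test.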