Pitfalls of Test Data Sourcing (and how to avoid them)
Written by Gwenn Berry

Your verification process is only as good as the data you put into it. Over the years, we've seen many development teams held back by gaps in the breadth, depth, and dynamism of their test data. Here are some of the most common pitfalls.


Creating a "Gold Standard" or Baseline Set, and Never Looking Back

Data Smell:

"Why this data? Because it's the data we've always used"

"Janice hand-curated this data last year, and she's an expert in the field"

"We used this test data for V&V on a similar product we developed in the past"

"Sure, it's missing some of our use cases, but we can cover them with ad hoc testing"

One of the key shortcomings of the traditional informatics testing approach is the reliance on a static set of test data. In traditional testing, the barriers to adding or modifying test inputs and outcome expectations (SME time-cost, verification script development, and manual curation) are high enough that this activity typically happens just once in a product development lifecycle and with a restrictive scope.

Those familiar with the product development and sustaining lifecycle will be aware that relying only on a limited and static set of test data exposes your product to risks:

  • overfitting to the initially available data

  • edge cases and defects outside the test set being exposed in the field

  • feature paths left untested as development continues

Testing teams will often find themselves supplementing a sparse baseline/automation test set with ad hoc analyses and spot checks.

At Miqa, we believe the Test Body should be a living, growing thing. As new datasets become available, or as new information emerges about an existing dataset (a new truth, or even the retraction of a previously characterized truth), these should be seamlessly integrated into your verification framework so that you always have the best possible shot at achieving accuracy, reproducibility, and robustness in your application. Read more about the elements of a truly effective Test Body here.
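One common way to keep a Test Body growing without rewriting test code is to drive verification from a dataset manifest, so a newly added dataset or a revised truth is picked up automatically on the next run. The sketch below is a generic pytest pattern, not Miqa's API; the manifest path, its schema, and the `run_pipeline` fixture are hypothetical stand-ins for your own framework.

```python
# A generic, manifest-driven pattern (not Miqa's API). Adding a dataset or
# revising a truth means editing the manifest and data files, not the tests.
import json
from pathlib import Path

import pytest

MANIFEST = Path("test_data/manifest.json")  # hypothetical location


def load_cases():
    """Each manifest entry pairs an input dataset with its current expected truth."""
    with MANIFEST.open() as fh:
        return json.load(fh)["datasets"]


@pytest.mark.parametrize("case", load_cases(), ids=lambda c: c["name"])
def test_output_matches_current_truth(case, run_pipeline):
    # `run_pipeline` is a placeholder fixture wrapping your application's entry point.
    produced = run_pipeline(case["input_path"])
    expected = json.loads(Path(case["truth_path"]).read_text())
    assert produced == expected
```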


Conveniently Successful Test Data

Data Smell:

"Precision and recall are at 100% on all of our test datasets"

"We can't include issue-linked datasets in our test set because it will impact the overall stats"

Realistic, challenging, and defect-exposing datasets are crucial elements of your Test Body. They allow you to monitor how performance changes over time as bugs are fixed and features change.

Failures on challenge datasets or known-issue-linked data don't need to impact your top-line metrics. With datasource tagging, characterization, and filtering, these datasets can be reported on separately or rolled into your summary metrics, depending on your needs.
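As a rough illustration of that separation (a generic Python sketch, not Miqa's tagging feature itself; the tag names and the `DatasourceResult` fields are hypothetical), tagged challenge and known-issue datasets can be excluded from, or folded into, a summary metric at reporting time:

```python
# Hedged sketch: tag each datasource so challenge / known-issue data can be
# excluded from top-line metrics yet still reported on separately.
from dataclasses import dataclass, field


@dataclass
class DatasourceResult:
    name: str
    precision: float
    recall: float
    tags: set = field(default_factory=set)  # e.g. {"baseline"} or {"challenge", "ISSUE-142"}


def summarize(results, exclude_tags=frozenset({"challenge", "known-issue"})):
    """Average precision/recall over results carrying none of the excluded tags."""
    kept = [r for r in results if not (r.tags & exclude_tags)]
    if not kept:
        return None
    return {
        "n": len(kept),
        "precision": sum(r.precision for r in kept) / len(kept),
        "recall": sum(r.recall for r in kept) / len(kept),
    }


results = [
    DatasourceResult("baseline_panel_A", 0.99, 0.97, {"baseline"}),
    DatasourceResult("edge_case_low_coverage", 0.62, 0.55, {"challenge", "ISSUE-142"}),
]

print("top-line:", summarize(results))                            # challenge data excluded
print("all data:", summarize(results, exclude_tags=frozenset()))  # everything included
```

The same tagged results can then feed a separate challenge-dataset report, so failures there stay visible without skewing the headline numbers.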


Not Testing Problematic Data after Issue Resolution

Data Smell:

"We don't need to keep testing that dataset, we fixed that bug 2 versions ago"

"We already have a test dataset that shows that bug, we don't need to include this new one"

Previously identified defects should continue to be monitored even after they have been fixed or accepted as known issues. Tangential or unrelated development can have knock-on effects that "un-fix" a previously resolved bug or change performance. In addition, an overly targeted bugfix or patch (often a stopgap or hotfix shipped with a plan to revisit it properly later, a plan that is then forgotten) may not resolve the same error mode under different conditions.

The re-appearance of a supposedly "solved" issue in the field can cause considerable confusion, cost significant time, and seriously hurt a development team's credibility. Prevent surprises by regression testing for solved issues with robust issue-linked data.
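One way to keep that guard in place (a minimal pytest sketch, not Miqa-specific; the issue IDs, file paths, and the `run_pipeline`/`load_truth` fixtures are hypothetical) is to parametrize a permanent regression test over the issue-linked datasets:

```python
# Sketch: issue-linked datasets stay in the suite permanently, so a "solved"
# issue that regresses fails loudly with the original issue ID attached.
import pytest

# Hypothetical registry of datasets that reproduced past defects.
ISSUE_LINKED_CASES = [
    ("ISSUE-87", "test_data/issue_87_truncated_input.json"),
    ("ISSUE-142", "test_data/issue_142_low_coverage.json"),
]


@pytest.mark.regression
@pytest.mark.parametrize("issue_id, dataset_path", ISSUE_LINKED_CASES)
def test_resolved_issue_stays_resolved(issue_id, dataset_path, run_pipeline, load_truth):
    # `run_pipeline` and `load_truth` stand in for your own fixtures/helpers.
    result = run_pipeline(dataset_path)
    assert result == load_truth(dataset_path), f"{issue_id} appears to have regressed"
```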
