Optimizing Data-driven Tests through Equivalence Partitioning and Boundary Value Analysis


Automated GUI tests can be designed to be configurable, so that the same test can be executed with varying input data sets. In this article, I will present the concepts of equivalence partitioning and boundary value analysis. These concepts help to create data-driven tests that reach rather high code coverage while keeping the data sets small.

Through concepts like data-driven testing, Squish users can easily execute tests with a potentially high number of data sets, but does a high number of data sets necessarily increase the quality of our application? Or do we keep executing the same code paths with different data sets, wasting time and resources? Furthermore, if we want to execute the test with a smaller number of input values, how should we decide which values to test with?

An inexperienced software tester might select the test data randomly. Product owners or customers deliver data files which they think are worth testing. A software tester with in-depth knowledge of the application’s source code can use a code coverage analysis tool like Squish Coco to judge the quality of the test data. But even without such a tool, one can make rather good decisions when selecting test data by considering a bit of theory.


Let’s assume that we want to automate a GUI test for a form. A user may enter a date into the form, for example the validity date of a credit card. You can see an example of this in the image below.

The form offers two input fields, which the application validates after receiving user input. Technically, the user can enter any character into the input fields. To keep the example simple, we will focus on numeric input only. We can easily extend the test later to cover other invalid character classes.

The validation takes place once the cursor leaves the input field. The application draws a red border around the input field if the user enters an invalid value. The form is supposed to accept months between 1 and 12 and years between 2000 and 2099 (remember the millennium bug?). The validity of the credit card itself would be checked after the user clicks on the Send button, which is a different validation.

Which input data should we pick to test the input validation? Implementation-wise, there is likely no big difference between entering 2 and entering 3 into the month field. But entering 12 or 13 for the month definitely makes a difference. How can we address this in a structured way when selecting the test data? By combining the equivalence partitioning concept with a boundary value analysis.

Equivalence Partitioning

Equivalence partitioning aims at a small but effective set of input values. This concept assumes that the application behaves the same way for every input value within a given equivalence class.

So how do we find the equivalence classes for the month and year input fields? As a first step, we can write down the conditions for the validation:

C1: 1 <= month <= 12
C2: 2000 <= year <= 2099

This leads us to the following valid equivalence classes:

M1 = { month: 1 <= month <= 12 }
Y1 = { year: 2000 <= year <= 2099 }

Of course, there’s invalid input to consider as well. To cover these cases in negative tests, we can write down a couple of invalid equivalence classes:

M2 = { month: month < 1 }
M3 = { month: month > 12 }
Y2 = { year: year < 2000 }
Y3 = { year: year > 2099 }
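
As a minimal sketch, the six classes can be expressed as two small Python functions that map a numeric input to its class. The function names are my own and are not part of the application under test or of Squish; the class labels are the ones defined above.

```python
def classify_month(month):
    """Return the equivalence class label for a numeric month input."""
    if month < 1:
        return "M2"  # invalid: below the valid range
    if month > 12:
        return "M3"  # invalid: above the valid range
    return "M1"      # valid: 1 <= month <= 12


def classify_year(year):
    """Return the equivalence class label for a numeric year input."""
    if year < 2000:
        return "Y2"  # invalid: below the valid range
    if year > 2099:
        return "Y3"  # invalid: above the valid range
    return "Y1"      # valid: 2000 <= year <= 2099


# Two different inputs that fall into the same class.
print(classify_month(3), classify_month(7))  # M1 M1
```

Any two inputs that receive the same label, such as 3 and 7 for the month, are expected to trigger the same validation behavior.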

The theory of equivalence partitioning tells us that we need to pick just one value from each equivalence class. This value tests the behavior of the application for the whole class. Using more values from the same class is unlikely to find new faults, because they are likely to execute the same code path. So, which values should we pick?

Boundary Value Analysis

Equivalence partitioning is typically used together with boundary value analysis. Through this analysis, we can select the most effective values. Let us take a look at how the partitions look for the month:

... -3 -2 -1 0 | 1 ..... 12 | 13 14 15 ...
 p1 (invalid)  | p2 (valid) | p3 (invalid)

The boundary is the place between two partitions. It’s where the behavior of the application changes. This means that we should include values from each side of a boundary. For the month example shown above, these are the values 0, 1, 12, 13.

If we use the same method for the year, we find the values 1999, 2000, 2099, 2100. Thus, the following values should be used in a data-driven test.

months = { 0, 1, 12, 13 }
years = { 1999, 2000, 2099, 2100 }
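
The selection can be sketched in a few lines of Python. The helper names `boundary_values` and `expected_rows` are hypothetical; they only derive the values listed above from the valid range and pair each one with the result we expect from the validation.

```python
def boundary_values(low, high):
    """Values on each side of both boundaries of the valid range [low, high]."""
    return [low - 1, low, high, high + 1]


def expected_rows(low, high):
    """Pair each boundary value with the expected validation result."""
    return [(value, low <= value <= high) for value in boundary_values(low, high)]


month_rows = expected_rows(1, 12)
year_rows = expected_rows(2000, 2099)

print(month_rows)  # [(0, False), (1, True), (12, True), (13, False)]
print(year_rows)   # [(1999, False), (2000, True), (2099, True), (2100, False)]
```

Each row could then be stored as one record of a data file that drives the GUI test: type the value, leave the field, and check whether the red border appears.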

Can we be sure that this really covers all cases? Since we are doing black box testing, we do not have knowledge of the implementation details. Even if we assume that we might have limited knowledge of the application’s source code (gray box testing), we might not be 100% sure how the validation is implemented.

For example, the partition p2 above might consist of two sub-partitions. A different branch of the source code might be executed for each sub-partition. For the simple case of a conditional statement for a month validation, this is rather unlikely. But one can think of other, more complex examples.
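
To illustrate the mechanism, here is a deliberately faulty and purely hypothetical month validation in Python. It splits p2 into a single-digit and a two-digit branch and gets the internal boundary wrong. The example is contrived and does not describe any real implementation.

```python
def is_valid_month(month):
    """Hypothetical, deliberately faulty validation with a hidden split in p2."""
    if 1 <= month < 10:    # branch for single-digit months
        return True
    if 10 < month <= 12:   # branch for two-digit months; bug: 10 is excluded
        return True
    return False


boundary_months = [0, 1, 12, 13]

# All four boundary values behave as specified by 1 <= month <= 12 ...
for month in boundary_months:
    assert is_valid_month(month) == (1 <= month <= 12)

# ... but a value at the hidden internal boundary is wrongly rejected.
print(is_valid_month(10))  # False, although 10 is a valid month
```

The four boundary values pass, so the fault at 10 goes unnoticed. Only knowledge of the hidden split, or a coverage measurement, would suggest adding 9, 10 and 11 to the data set.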


Equivalence partitioning and boundary value analysis can be helpful concepts. They help to choose the test data for data-driven tests and to avoid superfluous executions of the same code. This way we can save time and resources and increase the code coverage. However, for more complex input data, you should consider using a code coverage analysis tool like Squish Coco.

