Mutation Testing: Testing Your Tests

Francisco Moreno

QE Director

October 28, 2024

At the core of testing is trust. We write tests to ensure nothing breaks, but how can we trust tests that we've never seen fail? Will they catch mistakes when we mess up the code?

This is where mutation testing can be incredibly helpful. It’s a technique that evaluates the quality and effectiveness of your unit tests.

In essence, mutation testing makes small, controlled changes (mutations) to your application's source code, then runs your test suite against these "mutated" versions. If your tests are solid, they should catch these changes and fail. If they don't, it might indicate a weakness in your tests.

This technique allows us to test our own tests, adding an extra layer of security and confidence in our software development process.

Why is it important?

Test quality is a critical part of software development, but it’s often difficult to measure. Developers have traditionally relied on metrics like code coverage to assess the thoroughness of their tests. However, if we rely solely on this metric, we won’t get the full picture.

High code coverage only tells us that a portion of the code were executed during testing, but doesn’t give us information about the quality or effectiveness of those tests. It’s possible to have 100% coverage with tests that don’t actually verify the correct behavior of the code.

This is where mutation testing brings its value. By introducing deliberate changes to the code and checking whether the tests catch them, it provides a more accurate measure of how well our tests verify the code's behavior.

So, mutation testing is an excellent complement to code coverage.

A Closer Look

The mutation testing process can be broken down into several steps:

1- Mutant generation: Modified versions of the original source code are created. Each mutant contains a small change, such as swapping an arithmetic or logical operator, modifying a condition, etc.

2- Test execution: The test suite is run against each mutant.

3- Results analysis: It’s evaluated whether the tests caught the mutation (the mutant was "killed") or not (the mutant "survived").

The types of mutations can vary, but some common examples include:

Change arithmetic operators (e.g., + to -, * to /)
Modifying logical operators (e.g., AND to OR, > to >=)
Altering return values (e.g., return true to return false)
Removing lines of code

The result of mutation testing is often expressed as a "mutation score", which is the percentage of mutants that were caught (killed) by the tests.

One of the main advantages of mutation testing is that it can reveal subtle weaknesses in tests that other techniques might overlook. However, it does have drawbacks, such as potentially long run time and the need to correctly interpret the results.

Example

Let's look at an example with Java and the PIT mutation testing library.

Imagine we have a module that calculates the total price of the shopping cart, with special cases for discounts based on the total amount or specific items.

public class CarritoCompra {

    public static class Item {

        double precio;

        int cantidad;

        boolean descuentoAplicable;


        public Item(double precio, int cantidad, boolean descuentoAplicable) {

            this.precio \= precio;

            this.cantidad \= cantidad;

            this.descuentoAplicable \= descuentoAplicable;

        }

    }


    public double calcularTotal(List<Item> carrito) {

        double total \= 0;
  

        for (Item item : carrito) {

            double descuento \= item.descuentoAplicable ? item.precio \* 0.1 : 0;

            total += (item.precio \- descuento) \* item.cantidad;

        }


        if (total \> 100) {

            total \= total \* 0.9; // Aplicar 10% de descuento si el total es mayor a 100

        }


        return total;

    }

}

In an initial version, we might start with the following tests:

@Test

public void testCalculoTotalSinDescuentoGlobal() {

   CarritoCompra carritoCompra \= new CarritoCompra();

   List<CarritoCompra.Item> carrito \= Arrays.asList(

      new CarritoCompra.Item(20, 2, true),  // Precio: 20, Cantidad: 2, Descuento Aplicable

      new CarritoCompra.Item(50, 1, false)  // Precio: 50, Cantidad: 1, Sin descuento

   );

   double total \= carritoCompra.calcularTotal(carrito);

   assertEquals(86, total);

}


@Test

public void testCalculoTotalConDescuentoGlobal() {

  CarritoCompra carritoCompra \= new CarritoCompra();

  List<CarritoCompra.Item> carrito \= Arrays.asList(

    new CarritoCompra.Item(60, 2, false),  // Precio: 60, Cantidad: 2, Sin descuento

    new CarritoCompra.Item(50, 1, false)   // Precio: 50, Cantidad: 1, Sin descuento

  );

 double total \= carritoCompra.calcularTotal(carrito);

 assertEquals(153, total);  // Descuento global del 10% sobre 170

}

Both tests seem reasonable, and also give us 100% coverage. However, when we run the PIT tool and look at the report, we see that our tests could be improved.

What we observe next is that, despite having 100% coverage, we’re not considering the edge case where the total amount of the items is exactly 100 and, so the “mutant” survives.

To improve this, we should add a test case that considers this scenario.

@Test

public void testConDescuentoAplicable() {

   CarritoCompra carritoCompra = new CarritoCompra();

   List<CarritoCompra.Item> carrito = Arrays.asList(

           new CarritoCompra.Item(100, 1, false)  

   );

   double total = carritoCompra.calcularTotal(carrito);

   assertEquals(100, total);

}

Use Cases and Tools

Mutation testing can be implemented in various scenarios:

1- Continuous development: Developers can use mutation testing locally to improve their tests before committing code.

2- Continuous integration: Mutation testing can be integrated into CI/CD pipelines to ensure test quality is maintained over time.

3- Code review: Mutation testing results can be part of the acceptance criteria during code reviews.

4- Legacy code maintenance: Mutation testing can help identify areas where existing tests are insufficient in older codebases.

There are mutation testing tools for various programming languages. For example, PIT for Java, mutmut for Python, Stryker for JavaScript, and many others.

Conclusion

Mutation testing is a simple but highly useful tool for improving the quality of our unit tests. It helps us build confidence by exposing weaknesses that other metrics might miss.

Some recommendations for implementing it in existing projects include:

1- Starting in local environment or with IDE plugins.

2- Beginning with small, critical modules.

3- Gradually integrating it into the CI/CD process.

4- Using mutation testing results as a guide to improve test, not as a punitive metric.

5- Combining this technique with other code quality practices like code reviews and static analysis.

Remember: don't trust a test that you've never seen fail. Mutation testing helps ensure your tests will fail when they need to.

Francisco Moreno

QE Director

Francisco Moreno, QE Director at SNGULAR, leads the company's testing strategy and a team of professionals specialized in quality assurance in agile projects.

Previous Next

Our latest news

Interested in learning more about how we are constantly adapting to the new digital frontier?

Shai‑Hulud: The massive attack on npm that is shaking up the software supply chain

October 8, 2025

Shai‑Hulud: The massive attack on npm that is shaking up the software supply chain

Madrid pulses with the new era of Artificial Intelligence at the Google Cloud Summit 2025

May 26, 2025

Madrid pulses with the new era of Artificial Intelligence at the Google Cloud Summit 2025

Google launches its ultimate offensive in artificial intelligence from Cloud Next 2025

April 14, 2025

Google launches its ultimate offensive in artificial intelligence from Cloud Next 2025

Visualizing Software Quality with JIRA, Google Sheets, and Looker

March 19, 2025

Visualizing Software Quality with JIRA, Google Sheets, and Looker