Mutation Testing with Stryker in an AngularJS 1.6 App

In my current project I’m working on a large and messy Angular front-end. We frequently make small changes, and bugs keep creeping into the site. This has had me wondering about a few questions lately:

How do we make sure that our unit tests are testing the right things? And how do we find out if there is untestable code in the application, or if there are tests that always pass no matter what?

If you think the answer is measuring code coverage, then I have to disappoint you: code coverage only counts which lines of code are executed while the unit tests run. It does not tell you which lines are actually verified, let alone how thorough the tests are.
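To make the difference concrete, here is a minimal sketch in plain JavaScript, without Jasmine. The `discount` function and both "tests" are made up for illustration and are not part of the demo app. The first test executes every line of `discount`, so a coverage tool would count all of it as covered, yet it asserts nothing and can never fail.

```javascript
// Hypothetical function under test (not from the demo app).
function discount(price, isMember) {
  if (isMember) {
    return price - 10;
  }
  return price;
}

// "Test" 1: runs both branches, so every line counts as covered,
// but nothing is asserted. It passes no matter what discount() returns.
function coverageOnlyTest() {
  discount(100, true);
  discount(100, false);
  return true; // always "passes"
}

// Test 2: the same calls, but the results are actually checked.
function assertingTest() {
  return discount(100, true) === 90 && discount(100, false) === 100;
}

console.log('coverage-only test passes:', coverageOnlyTest());
console.log('asserting test passes:', assertingTest());
```

Both tests produce the same coverage figure, but only the second one would notice if `price - 10` were changed to `price + 10`.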

So how do we solve this problem? One of the potential answers is called mutation testing. What’s behind the interesting name? A mutation testing library takes the code that you are trying to test… and literally screws it up. It will, for example, remove function bodies and swap logical or binary operators for their counterparts. This is called mutation.

It will then run your tests against those destroyed versions of the code. Why does it do that? And what should happen? Each mutated version of the code should make at least one test fail. If all the tests still pass although a significant part of the code was altered, then you know something is wrong with your tests. It’s actually kind of like a test suite for your test suite.

To illustrate the concept I’ve created a diagram:

[Diagram: Stryker mutation testing principle]

To summarise: a tree of different so-called mutations of our code is generated. Borrowing terms from genetics, if at least one test fails for a particular mutant, we say the mutant is “killed” by the test suite; conversely, if all the tests pass, we say the mutant “survived”. Our task is then to kill it by writing a better or an additional test.
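The killed/survived principle can be sketched in a few lines of plain JavaScript. This is only an illustration of the idea, not how Stryker works internally: the mutants are written by hand, and the function, test list and `runMutationTest` helper are all hypothetical names.

```javascript
// Original code under test (hypothetical example).
const original = (a, b) => a + b;

// Hand-written mutants, imitating what a mutation tool would generate.
const mutants = [
  { name: 'empty body', fn: (a, b) => undefined },
  { name: '+ becomes -', fn: (a, b) => a - b },
  { name: '+ becomes *', fn: (a, b) => a * b },
];

// A deliberately weak test suite: each test returns true when it passes.
const tests = [
  (add) => add(2, 2) === 4,
  (add) => add(0, 0) === 0,
];

// A mutant is killed as soon as one test fails against it.
function runMutationTest(mutantList, testList) {
  return mutantList.map(({ name, fn }) => {
    const killed = testList.some((test) => !test(fn));
    return { name, status: killed ? 'killed' : 'survived' };
  });
}

// Sanity check: the tests must pass against the unmodified code.
console.log('original passes:', tests.every((test) => test(original)));

const report = runMutationTest(mutants, tests);
report.forEach((r) => console.log(`${r.name}: ${r.status}`));
```

Here the `*` mutant survives because 2 + 2 and 2 * 2 happen to be equal, and so do 0 + 0 and 0 * 0. Adding a test such as `add(2, 3) === 5` would kill it.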

The concept of mutation testing itself has been around for decades, but it used to be too computationally expensive to be of practical use. Now, with fast and parallel computing power at our fingertips, we can give it another try… In JavaScript there are several mutation testing libraries, and one of them is already fairly mature: Stryker.

Stryker actually incorporates the technique of measuring code coverage to determine which unit tests should be executed against which mutants.
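Roughly, the idea behind that selection can be sketched as follows. This is a simplified illustration, not Stryker's actual implementation: the coverage map, the test names, the line numbers and the `testsForMutant` helper are all invented for the example.

```javascript
// Hypothetical per-test coverage: which lines of app.js each test executed.
const coverageByTest = {
  'add: should sum two numbers': [9, 10, 11],
  'and: should return true': [14, 15, 16],
};

// Hypothetical mutants, each tied to the line it changes.
const mutants = [
  { id: 1, mutator: 'BinaryOperator', line: 10 },
  { id: 2, mutator: 'LogicalOperator', line: 15 },
  { id: 3, mutator: 'BlockStatement', line: 20 },
];

// Only tests that executed the mutated line can possibly notice the change.
function testsForMutant(mutant, coverage) {
  return Object.keys(coverage).filter((name) => coverage[name].includes(mutant.line));
}

mutants.forEach((m) => {
  const selected = testsForMutant(m, coverageByTest);
  console.log(`mutant ${m.id} (line ${m.line}):`, selected.length ? selected : 'no covering test');
});
```

A mutant on a line that no test executes (mutant 3 here) cannot be killed by any test, so it can be flagged without running anything, and every other mutant only needs the handful of tests that touch its line.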

Examples please!

Of course, right ahead… I wanted to give mutation testing a try but I couldn’t find any running example Angular applications with Stryker online, so I created one myself.

With a couple of hiccups due to the still thin documentation on the web, Stryker was easy to install: I added a couple of dependencies to package.json and a stryker.conf.js to the project. In stryker.conf.js we define that we want to use Jasmine as the test framework and Karma as the test runner.

Stryker will then import karma.conf.js to determine the test settings. An important thing to highlight is that the list of files to mutate is defined in stryker.conf.js, but the list of files needed to execute the tests is imported from karma.conf.js.
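For orientation, a stryker.conf.js along these lines might look like the sketch below. It is not copied from the demo project. The keys follow the function-style configuration that early Stryker releases used, and newer versions name some of them differently, so treat every option name here as an assumption and check the documentation for the version you install.

```javascript
// stryker.conf.js (sketch; option names are assumptions based on early
// Stryker releases and may differ in the version you install)
function strykerConfig(config) {
  config.set({
    // Files Stryker is allowed to mutate.
    mutate: ['app.js'],
    // Jasmine as the test framework, Karma as the test runner.
    testFramework: 'jasmine',
    testRunner: 'karma',
    // The files needed to run the tests come from the Karma configuration.
    karmaConfigFile: 'karma.conf.js',
    // Console output plus the HTML report.
    reporter: ['clear-text', 'html'],
  });
}

module.exports = strykerConfig;
```

Keeping the mutate list short at first (a single file, as here) makes the first runs quick and the output easy to read.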

So what does it look like when Stryker is run and how can we make its magic visible? Let’s run

node_modules/.bin/stryker run --logLevel debug stryker.conf.js

which will give us an actual output of what is mutated and how:

[DEBUG] ClearTextReporter - Mutant killed!
[DEBUG] ClearTextReporter - /Users/daten/Sites/AngularStrykerDemo/app.js: line 9:40
[DEBUG] ClearTextReporter - Mutator: BlockStatement
[DEBUG] ClearTextReporter - -               $ctrl.add = function (a, b) {
[DEBUG] ClearTextReporter - -                 var result = a + b;
[DEBUG] ClearTextReporter - -                 return result;
[DEBUG] ClearTextReporter - -               };
[DEBUG] ClearTextReporter - +               $ctrl.add = function (a, b) {
[DEBUG] ClearTextReporter - +   };
[DEBUG] ClearTextReporter - Ran all tests for this mutant.
[DEBUG] ClearTextReporter - Mutant killed!
[DEBUG] ClearTextReporter - /Users/daten/Sites/AngularStrykerDemo/app.js: line 10:27
[DEBUG] ClearTextReporter - Mutator: BinaryOperator
[DEBUG] ClearTextReporter - -                 var result = a + b;
[DEBUG] ClearTextReporter - +                 var result = a - b;
[DEBUG] ClearTextReporter -
[DEBUG] ClearTextReporter - Ran all tests for this mutant.
[DEBUG] ClearTextReporter - Mutant survived!
[DEBUG] ClearTextReporter -   /Users/daten/Sites/AngularStrykerDemo/app.js: line 15:27
[DEBUG] ClearTextReporter - Mutator: LogicalOperator
[DEBUG] ClearTextReporter - -                 var result = a && b;
[DEBUG] ClearTextReporter - +                 var result = a || b;

The first mutation deleted the entire body of our add method. The second exchanged the plus operator for a minus in that same method. Luckily, at least one test failed for each of these mutants, hence the Mutant killed! output.

The third mutation – switching a logical and for an or – unfortunately was not caught by our unit tests which is indicated by the Mutant survived! output. We can fix this by adding the following test in app.spec.js (which is currently commented out):

it('should return false', function () {
  // false && true is false, but the mutant (false || true) returns true
  expect($ctrl.and(false, true)).toEqual(false);
});

The next time we run Stryker, it will recognise that this test fails for that particular modification, so the mutant is killed and all will be good. No survivors.
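Why does this particular test do the job? A small plain JavaScript sketch, outside Angular and Jasmine, lists the inputs for which `&&` and `||` give different results. The two one-line functions merely stand in for the `and` method and its mutant.

```javascript
// Stand-ins for the original line and the mutant Stryker generated for it.
const original = (a, b) => a && b;
const mutant = (a, b) => a || b;

// Try every combination of boolean inputs and keep the ones where
// original and mutant disagree. Only those inputs can kill the mutant.
const inputs = [
  [true, true],
  [true, false],
  [false, true],
  [false, false],
];
const killingInputs = inputs.filter(([a, b]) => original(a, b) !== mutant(a, b));

// Logs the two mixed pairs: (true, false) and (false, true).
console.log(killingInputs);
```

The test above uses (false, true), one of the two killing inputs. A spec that only ever calls the method with (true, true) or (false, false) cannot tell the two versions apart, which is presumably why the mutant survived before.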

To help you analyse the results of the mutation run, Stryker also gives you a handy summary in the form of an HTML page via the html reporter. And by the way, a full list of the mutators that Stryker has built in can be found here.


I hope that I have convinced you that mutation testing is a great idea. But how practical is it to use mutation testing in a production app? Unfortunately I don’t have the answer to that. I myself have only tried it in a sample app so far. To integrate it in a productive way into a live app would be a much bigger effort and personally I am skeptical about the usefulness of doing so.

One problem that I see with using mutation testing in a large live application is the increased time needed to execute all the tests. From the diagram above we can see that the execution time of the test suite is multiplied by the number of mutations made to the code. Every unit test is now run not just once, but once against every mutated version of the code.

Another problem that I see is the amount of time you’d have to spend killing all your surviving mutants. Ideally this would result in great test quality, and that’s what we are after. But in my own experience it is hard enough to get a team of developers to write tests for every feature anyway. Getting them to also run mutation tests and fix another layer of failures might encounter a lot of resistance. But of course every project and team is different, so I can’t speak for everyone.
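A back-of-the-envelope estimate shows how quickly this adds up. All numbers below are hypothetical, and the `estimateMutationRunSeconds` helper is just a sketch of the multiplication described above. It ignores start-up overhead and the fact that a run against a mutant can stop at the first failing test.

```javascript
// Rough estimate of a full mutation run; all inputs are hypothetical.
// testFraction is the share of the suite that runs per mutant
// (1 means every test runs against every mutant).
function estimateMutationRunSeconds({ mutants, suiteSeconds, workers = 1, testFraction = 1 }) {
  return (mutants * suiteSeconds * testFraction) / workers;
}

// 2000 mutants, a 30 second suite, everything run sequentially.
const naive = estimateMutationRunSeconds({ mutants: 2000, suiteSeconds: 30 });

// Same project, but 4 parallel workers and only a tenth of the tests
// selected per mutant (for example through coverage analysis).
const tuned = estimateMutationRunSeconds({
  mutants: 2000,
  suiteSeconds: 30,
  workers: 4,
  testFraction: 0.1,
});

console.log(`naive: ${(naive / 3600).toFixed(1)} h`);
console.log(`tuned: ${(tuned / 60).toFixed(1)} min`);
```

With these made-up figures the naive run takes well over half a day, while parallel workers and test selection bring it down to under half an hour. That is still far slower than the 30 second suite it started from.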

However I think the concept is super interesting and can teach us a lot about how good tests are written. I really enjoyed my excursion into mutation testing and hope to someday use it in a way that goes beyond simple experiments. I’d be interested to hear any experiences using mutation testing in production if there are people out there!