From TDD to PBT via Kotest
I’ve been a big fan of Property Based Testing for a number of years, based on my experiences with ScalaCheck. It’s always been an annoyance that Kotlin did not support this testing style, at least to the same extent. There was some functionality in Kotest (formerly KotlinTest), but it paled in comparison to what was available in Scala, F# and Python.
Fortunately, the support in Kotest has come a long way, to the point where adopting PBT is Kotlin is now perfectly viable. In this article we will take a simple kata and convert the test suite from regular Unit Tests to Property Based Tests.
The Sample Problem
Here’s an interview question I saw posted on Twitter some time ago. Given a string such as aaabbcccc produce a list of Tuples such that each subsequence is represented by the character and its length. In this case the answer would be [(‘a’, 3), (‘b’, 2), (‘c’, 4)].
When I sat down to try this here are the tests I wrote, via the standard TDD loop:
This was my corresponding implementation, which passes the tests.
It was only on reflection that I realised this solution has a major flaw. It cannot handle an empty string.
This didn’t occur to me because I had Bounded Context Blindness — I was focused solely on solving the cases listed in their original tweet and didn’t consider the bigger picture.
Although this is only a toy problem, it illustrates an important issue. As developers we naturally focus on the trees and not the forest. No matter how conscientious we might be, everyone wants to be done and home by five. This is why pair (and mob) programming is so valuable and why professional testers are a necessity. It is also the problem addressed by PBT.
PBT To The Rescue
In Property Based Testing we do not supply input and outputs. Instead we provide true / false statements which can be verified against (semi) randomly generated inputs.
There are two challenges with this style of testing:
- Thinking up the properties themselves
- Ensuring we have appropriate generators
In this case we can observe that the length of the string should always be equal to the totals of all the pairs. In other words given the input aaabbcccc and the result [(‘a’, 3), (‘b’, 2), (‘c’, 4)] we could verify that 9 is equal to 3 + 2 + 4.
Here’s the test:
As well as validating the scenarios we intended this will also cause the NullPointerException. This is because the empty string is a known edge case always included by the String generator.
Here’s the error that results:
Property failed after 23 attempts
Repeat this test by using seed -9194171543674236027
Caused by StringIndexOutOfBoundsException: String index out of range: 0
java.lang.AssertionError: Property failed after 23 attempts
A simple addition to the implementation allows the test to pass:
This is the strength of PBT. Because the testing tool has no knowledge of our problem domain it is very good at finding the Unknown Unknowns in our code. Those who try out PBT on an existing codebase invariably report that the tool finds errors they had never considered. Because you only write tests for the scenarios you can think of.
An Additional Property
The biggest danger with the PBT approach is that you end up comparing the solution to itself. Because the inputs are randomly generated you need a way to infer what the right answer should be, and that method of inference must be different from the implementation under test.
Some standard solutions are:
- Test invariants in the code. This is what we have just done.
- Round trip the data. For example marshalling to and from JSON.
- Use an alternative implementation that is known to work, but cannot be applied in production. Perhaps because it is too slow, relies on deprecated technologies or is ruled out by legal / licensing considerations.
- Exploit existing research regarding the problem domain. For example if we were writing code to recognise poker hands we could check that the frequency with which random hands were identified matched the known probabilities.
To illustrate this here’s another property we could use in our current problem. We can write and run a Regular Expression that matches when a character is followed by a character other than itself:
This is an alternative implementation we could have used, and has the virtue of exploiting existing research (we can find the regex online).
Now we can test that the number of matches for the Regex is equal to the number of Pairs in the list.
Combining and TDD and PBT
Now that PBT is well supported in Kotest we can decide how best to combine it with existing techniques. Here are three possible options:
- Use PBT simply to find the Unknown Unknowns. So continue with TDD as usual, but write many “this function never throws” type properties. For many companies this seems to be the gateway drug to PBT.
- Write both conventional tests and PBT properties. Include the scenarios discovered via PBT in your conventional test suite. This is the best of both worlds option and the one we would encourage.
- Stop doing TDD and rely on the PBT approach to find errors in your code. As edge cases are discovered write custom generators to ensure they are included in every test run. You might call this the purist PBT approach.
A Remaining Worry
Let’s say you put your faith in PBT and build your test suite around properties. After a while you would be bothered by an additional doubt. How do you know that the built in generators are creating a sufficiently diverse set of test cases?
In this case the String generator is creating strings of arbitrary length and content. But we have to put an upper limit on the number of values generated. So how do we know that enough of these contain substrings of the same characters?
There are several approaches we could take:
- Increase the number of test cases, or simply have the testing tool run continuously within our CI/CD pipeline.
- Some PBT tools allow you to design and apply metrics. These can be used to ensure the generated values meet whatever criteria you care about.
- Write a bespoke generator that produces exactly the kind of data we need.
Here’s an example of the third approach:
The code works as follows:
- First we declare a list of ranges of characters. These will be the characters produced from our custom generator.
- We use the char generator to select randomly from these ranges.
- Then each character is transformed into a sequence of itself via the int generator. We use flatMap to ensure that the output is a single generator, as opposed to a list of generators.
- This repeatedCharGen is then embedded in the list generator. This gives us a generator that can produce a list of between zero and one hundred strings, each of which consists of a single character repeated one to ten times.
- Finally, we map over the list to convert it into a single String
Let’s write a dummy test so we can examine the output:
As you can see the output contains precisely the kind of test values we want. Including the empty string:
Having proved the generator works, we can adapt our original properties to use it:
We have demonstrated that PBT in Kotlin is now both achievable and worthwhile. There are a wide range of generators available for built-in types. But more importantly it is straightforward to compose and extend these generators using standard FP techniques.
If you are new to PBT I hope this article has inspired you to have a closer look. In my view the best book on PBT is the ScalaCheck book from Artima. This contains detailed advice of writing properties and the examples can easily be ported to other tools. But you should first start with this superb talk by John Hughes, the inventor of PBT. If you know PBT from other languages then I hope this convinces you to have a go with Kotlin and Kotest. Live long and prosper!