tl;dr: You can often get a lot more mileage out of time spent writing tests if you can find properties to test using randomized data instead of (or as well as) cases to test with specific data.
I was working on a project recently where I was adding encryption capabilities to something. I wrote a test for it that looked something like this (heavily pseudo-coded):
byte[] theData = new byte[]{10, 10, 10, 10};
byte[] theKey = new byte[]{1, 2, 3…}; // 16 bytes
byte[] theInitVector = new byte[]{4, 5, 6…}; // 16 bytes
assertEqual(theData,
    decrypt(encrypt(theData, theKey, theInitVector),
        theKey, theInitVector));
The traditional path at this point would have been to expand the test to cover every edge case I could think of: setting various values to 0 or -1, or making the data to be encrypted very large or very small. None of these really captures what I’m trying to test, however: I want to know that no matter what data, key, or IV is passed into my code, encrypting and then decrypting the data gives me back the original data. In an ideal world, the test would look something like:
byte[] theData = anyByteArray(anyLength());
byte[] theKey = anyByteArray(16);
byte[] theInitVector = anyByteArray(16);
assertEqual(theData,
    decrypt(encrypt(theData, theKey, theInitVector),
        theKey, theInitVector));
This would express the intent of my test to future developers much better and, in less code, would cover the inputs I might otherwise have gone hunting for as manual edge cases.
The above is an example of a generative or property test – essentially, testing that certain properties hold across many generated inputs, rather than testing specific cases. Finding good properties to test for a given piece of code can be tough, but when you can find them, testing in this fashion can be very effective. In my example, my generative tests uncovered a bug that occurred roughly once in every 1000 cases!
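The idealized test above can be made concrete with nothing more than the standard library. What follows is only a minimal sketch, not the code from the project described here: it assumes AES in CBC mode with PKCS5 padding as a stand-in for the real encrypt and decrypt, and the class name RoundTripProperty and the method countFailures are hypothetical names invented for the illustration.

```java
import java.security.GeneralSecurityException;
import java.util.Arrays;
import java.util.Random;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

final class RoundTripProperty {
    // Assumption: AES/CBC/PKCS5Padding stands in for the code under test.
    private static byte[] run(int mode, byte[] data, byte[] key, byte[] iv) {
        try {
            Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
            cipher.init(mode, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
            return cipher.doFinal(data);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    static byte[] encrypt(byte[] data, byte[] key, byte[] iv) {
        return run(Cipher.ENCRYPT_MODE, data, key, iv);
    }

    static byte[] decrypt(byte[] data, byte[] key, byte[] iv) {
        return run(Cipher.DECRYPT_MODE, data, key, iv);
    }

    // Fills a new array of the requested length with random bytes.
    static byte[] anyByteArray(Random random, int length) {
        byte[] bytes = new byte[length];
        random.nextBytes(bytes);
        return bytes;
    }

    // Runs the round-trip property on `cases` generated inputs and
    // returns how many of them failed to decrypt back to the original.
    static int countFailures(long seed, int cases, int maxLength) {
        Random random = new Random(seed);
        int failures = 0;
        for (int i = 0; i < cases; i++) {
            byte[] data = anyByteArray(random, random.nextInt(maxLength + 1));
            byte[] key = anyByteArray(random, 16);
            byte[] iv = anyByteArray(random, 16);
            byte[] roundTripped = decrypt(encrypt(data, key, iv), key, iv);
            if (!Arrays.equals(data, roundTripped)) {
                failures++;
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        System.out.println("failures: " + countFailures(42L, 1000, 1024));
    }
}
```

Data lengths are drawn from 0 up to maxLength, so the empty array and lengths on either side of the 16-byte block boundary turn up without being listed by hand.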
Here are some ideas for ways to build property tests:
(1) Simple Properties: Any property you can express as a single equality. These are the most straightforward, but often the hardest to find. For most code these will manifest as round-trip-integrity properties similar to the one above.
(2) “Model” Properties: These are cases where your code should behave very similarly to some other code which is known to be correct. Cases would include testing a key/value store to ensure that it behaves the same as a Map, or testing an optimized version of some code against the original.
(3) Fuzzing, or “Don’t crash” Properties: Rather than checking true correctness, these tests merely ensure that no matter the data supplied or operations applied to a piece of code, no exceptions are thrown. Useful mostly for stateful objects with complex internals.
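As a rough illustration of the “model” idea in (2), the sketch below applies the same random sequence of operations to a small key/value store and to a HashMap, and reports the first step at which they disagree. ListStore is a hypothetical store invented for this example, as are the names ModelProperty and firstDivergence. Dropping the comparison and simply letting any exception propagate would turn the same loop into a “don’t crash” test as in (3).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.Random;

final class ModelProperty {
    // Hypothetical code under test: a store backed by two parallel lists.
    static final class ListStore {
        private final List<Integer> keys = new ArrayList<>();
        private final List<String> values = new ArrayList<>();

        // Each method returns the previous value (or null), like java.util.Map.
        String put(int key, String value) {
            int i = keys.indexOf(key);
            if (i >= 0) {
                return values.set(i, value);
            }
            keys.add(key);
            values.add(value);
            return null;
        }

        String get(int key) {
            int i = keys.indexOf(key);
            return i >= 0 ? values.get(i) : null;
        }

        String remove(int key) {
            int i = keys.indexOf(key);
            if (i < 0) {
                return null;
            }
            keys.remove(i);           // removes by index
            return values.remove(i);  // removes by index, returns old value
        }

        int size() {
            return keys.size();
        }
    }

    // Returns a description of the first disagreement with the model,
    // or null if the store matched HashMap on every operation.
    static String firstDivergence(long seed, int operations) {
        Random random = new Random(seed);
        ListStore store = new ListStore();
        Map<Integer, String> model = new HashMap<>();
        for (int step = 0; step < operations; step++) {
            int key = random.nextInt(8); // small key space forces overwrites
            String expected;
            String actual;
            switch (random.nextInt(3)) {
                case 0:
                    String value = "v" + random.nextInt(100);
                    expected = model.put(key, value);
                    actual = store.put(key, value);
                    break;
                case 1:
                    expected = model.get(key);
                    actual = store.get(key);
                    break;
                default:
                    expected = model.remove(key);
                    actual = store.remove(key);
                    break;
            }
            if (!Objects.equals(expected, actual) || model.size() != store.size()) {
                return "seed " + seed + ", step " + step
                        + ": expected " + expected + " but got " + actual;
            }
        }
        return null;
    }
}
```

Keeping the key range small is a deliberate choice: with only eight possible keys, the random sequence keeps overwriting and removing entries that actually exist, which is where a store is most likely to drift from the model.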
It’s important to note that while generative tests can be a powerful tool, they are not a replacement for other kinds of tests, especially if there are specific edge cases that you know need to be tested. There are plenty of behaviors that can be hard or impossible to properly capture as properties, and these deserve their own tests.
Generative testing can be accomplished via any number of libraries, or liberal use of your language’s random number generator – the only particular point to worry about is making sure that if a given example fails, you’re able to reproduce it. Generally this is done by using a single random number generator for all inputs and preserving the seed value of that generator.
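One way to make failures reproducible, sketched here under a few assumptions, is to route every generated input through a single java.util.Random and to put the seed in the failure message. The class SeededRunner, its methods, and the system property name test.seed are all hypothetical names chosen for this example, not part of any library.

```java
import java.util.Random;
import java.util.function.Predicate;

final class SeededRunner {
    // Uses a seed supplied via -Dtest.seed=... when replaying a failure,
    // otherwise picks a fresh one. The property name is an assumption.
    static long chooseSeed() {
        String configured = System.getProperty("test.seed");
        return configured != null ? Long.parseLong(configured) : System.nanoTime();
    }

    // Runs the property `cases` times, drawing all inputs from one
    // generator so that the seed alone is enough to replay the run.
    static void check(long seed, int cases, Predicate<Random> property) {
        Random random = new Random(seed);
        for (int i = 0; i < cases; i++) {
            if (!property.test(random)) {
                throw new AssertionError(
                        "Property failed on case " + i + "; rerun with seed " + seed);
            }
        }
    }

    public static void main(String[] args) {
        long seed = chooseSeed();
        // Example property: reversing a string twice gives back the original.
        check(seed, 1000, random -> {
            String s = Long.toString(random.nextLong(), 36);
            String twice = new StringBuilder(s).reverse().reverse().toString();
            return s.equals(twice);
        });
        System.out.println("passed with seed " + seed);
    }
}
```

Because the property receives the generator rather than creating its own, rerunning with the reported seed regenerates exactly the same sequence of inputs, including the failing one.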
This post is by Porter Westling, senior engineer. This is part of Friday Thoughts, a post series on improving best practices throughout LiveRamp’s engineering organization.