Many folks in our industry are familiar with unit testing, and practice it daily. Some folks get to the extreme of considering tests more important than production code, while others think it's just a waste of time, but unit tests have saved my bacon more times than I care to admit. They're a great way to ensure correctness, and adherence to business rules.

You may have been writing tests for a long time now, and yet not ever use one of the best tools there are: mocks, fakes and stubs. We'll see what they are, why they're useful, and what's the difference between each other.

Benefits of unit tests

An illustration showing a team around a computer screen, and a checklist on a clipboard with the title 'testing'.

First of all, if you don't already include testing in your development cycle, you should. The additional effort involved in writing and maintaining tests is more than paid back in peace of mind.

Expressing a requirement in a unit test will not only help you articulate its edge cases, but also inform you when further changes break expectations or cause regressions. Doing any significant refactoring without a comprehensive test suite is like free climbing: you may get away with it, and some folks love the feeling, but it's hardly something I'd recommend to the faint of heart.

Unit tests are also a great way to express the expected behaviour and contract of a piece of code. Unlike comments and external documentation, they cannot get out of date: any breaking change in the SUT (System Under Test; in this case the unit that is being tested) will cause well-written tests to go red and immediately alert the team.

Integration tests and validation tests are also very important tools to ensure the correctness of software we write, but in this post we're focusing mostly on unit testing.

There's plenty to be said about testing and its benefits[1], but that's beyond the scope of this article. I will leave it as an exercise to the reader to read up further on the topic — there are plenty of excellent articles and talks out there on the subject.

Doubles are here to help

In unit tests, you often find yourself having to deal with dependencies of the SUT. For example, testing an inventory management object may require an instance of InventoryRepository, which in turn depends on a Database, to save data to and read from. Since unit tests should never involve any system outside of the SUT itself, we find ourselves in a pickle: how do we test the functionality of our InventoryManager class without a real database?

class InventoryManager(private val repository: InventoryRepository) {
    fun totalStockCount(): Int =
    	repository.getItems().sumBy { it.stockCount }
    fun removeItem(sku: Int, count: Int = 1) {
    	repository.removeItem(sku, count)
    fun getItemStockCount(sku: Int): Int? =
    private fun getItem(sku: Int): InventoryItem? =
    	repository.getItems().find { it.sku == sku }
interface InventoryRepository {

    fun getItems(): List<InventoryItem>
    fun removeItem(sku: Int, count: Int = 1)
class DatabaseInventoryRepository(
	private val database: Database
): InventoryRepository {
	// ...

You need a so-called test double. There are three main kinds of test double, each with its strengths and weaknesses. Note that, in order to make the most of doubles, a well organised codebase with well thought-out abstractions will go a long way. In general, anything that should be swappable at test time with some double should be an interface, and not a concrete implementation.


A very popular solution is to provide a mock of the repository, using something like Mockito to create a non-functional implementation of the database interface. You can think of mocks as dummy, emptied out shells of your classes. By default, they don't do anything useful, but you can set them up to simulate a specific behaviour, such as returning a certain value when one of the methods is called:

val repository = mock<InventoryRepository> {
    on { getItems() } doReturn listOf(
        InventoryItem(sku = 123, name = "Banana", stockCount = 10),
        InventoryItem(sku = 456, name = "Potato", stockCount = 2)
val inventoryManager = InventoryManager(repository)


When you use mocks, you need to manually specify all the behaviour of the mock instance. This allows for great control over what the mock will do, but it can also lead to tests that are less meaningful, more verbose, and brittle.

The main issue with this usage of mocks is that they can lead to so-called white box testing, in which tests know exactly how a mock is interacted with by the SUT, and depend on implementation details, which is generally regarded as undesirable. Ideally, unit tests should only be concerned with the external API of the SUT, simulating how a user of that SUT may interact with it. This is called black box testing: the tests aren't aware of, nor depend on, the internals of the SUT.

One last caveat with mocks and mocking frameworks is that, by default, you cannot mock final classes. That's a potential issue considering that Kotlin classes are by default final, and it's certainly not ideal to make them all open just so that they can be mocked. There are workarounds, such as:

  • Using mock-maker-inline to allow mocking of final classes (be warned, this comes with a heavy performance penalty)
  • Using the allopen Kotlin compiler plugin (which is a bit of a hack and ends up creating bytecode that doesn't really match your sources)

They should be used very sparingly as they all come with some drawbacks and, more often than not, the mere need for such tools is a smell for some bigger problems. If you find yourself reaching for them, before adopting them, ask yourself: can I achieve the same results in any other way? Consider whether you could extract interfaces, and/or replace mocks with stubs or fakes.

Mocks aren't inherently evil

Now, before someone gets too riled up, I feel it's important to say that mocks are just a tool, and as such, they have a place in the testing toolbox. The important thing is not to use mocks for things they're not intended for.

Mocks allow you to test code that depends on code outside your control, for example. Mocks can be a last-resort approach to dealing with system APIs and third party libraries which aren't too testing-friendly. The recommended approach in such cases is not to mock things you don't control directly, but rather to wrap them in your own code, behind an interface, and mock that interface instead. Keep the wrapping as thin as possible, as it will be hard or impossible to properly unit test.

In general, mocking is useful to test behaviour: for example, making sure that your own BluetoothDevice#connect() calls the system API to connect to a Bluetooth device:

val device = BluetoothDevice("6E-89-23-98-F2-3B")
val bluetoothAdapter = mock<BluetoothAdapterWrapper> {
    on { connect(anyString()) } doReturn true


This would otherwise be tough to assert, since unit tests generally concern themselves with testing state after some action, and such interactions aren't always expressed as part of the public state of a SUT. It's arguably not a unit test's job to verify the interaction between separate components, but if you don't have integration tests or alternatives, it's better than nothing.


While mocks are useful for verifying behaviour, cases such as providing hardcoded return values can be better served by using stubs. A stub is a very minimal implementation of a class' public interface, that only ever returns hardcoded values. In our inventory example:

object StubInventoryRepository: InventoryRepository {

    override fun getItems() = listOf(
        InventoryItem(sku = 123, name = "Banana", stockCount = 10),
        InventoryItem(sku = 456, name = "Potato", stockCount = 2)
    override fun removeItem(sku: Int, count: Int = 1) {
    	// No-op, this is a stub
    override fun getItemStockCount(sku: Int): Int? = 100
val inventoryManager = InventoryManager(StubInventoryRepository)


If your stubs are used in more than one test, you'll save yourself from a lot of copy-pasting around of boilerplate code. Besides, if you don't need to verify interactions — and in unit tests you rarely should — stubs are the most intuitive and straightforward kind of double to use.


Stubs are great, but have one big drawback: they can't really do much more than return a hardcoded value. What if the interaction you are testing requires some more complex, stateful behaviour? Imagine we needed to check that after removing one item from the inventory, the stock count was actually reduced:

val inventoryManager = InventoryManager(StubInventoryRepository)

val sku = 123
val initialCount = inventoryManager.getItemStockCount(sku)
requireNotNull(initialCount) { "Item with SKU $sku not found in test data" }

inventoryManager.removeItem(sku, count = 1)
assertThat(inventoryManager.getItemStockCount(sku), "stock count after")
	.isEqualTo(initialCount - 1)

If we use stubs, this is, of course, going to fail:

<stock count after> expected to be 99, but was 100

That's because in order to test this behaviour we need our double to have a state, and stubs don't have one. Instead, we can use fakes, which are simplified implementations of the production code. For example, a fake InventoryRepository implementation may use a simple Map to store the current state, instead of a full-blown Database, and take several shortcuts:

class FakeInventoryRepository: InventoryRepository {
    private val items = mutableMapOf(
    	123 to InventoryItem(sku = 123, name = "Banana", stockCount = 10),
        456 to InventoryItem(sku = 456, name = "Potato", stockCount = 2)

    override fun getItems() = items.values.toList()
    override fun removeItem(sku: Int, count: Int = 1) {
    	val item = items[sku] ?: error("SKU $sku not found")
        require(item.stockCount - count >= 0) { "count larger than stock" }
    	items[sku] = item.copy(stockCount - count) 
    override fun getItemStockCount(sku: Int): Int? = 

Now, using a FakeInventoryRepository in our tests instead of the stub, we get a "good enough" rendition of the actual behaviour, and our unit tests work as we'd expect.

In this example, the fake implementation has a comparable complexity to the production implementation, but in real world scenarios, this tends not to be the case. It's worth noting that if you change the production code, you might also need to make changes to the fake, to keep the behaviour consistent.


We've seen that there are three kinds of test doubles: mocks, fakes, and stubs. Each has their own strengths, and each is more suited in certain scenarios. Mocks are great for verifying behaviour, but that's something that should almost never be required in unit tests, and can lead to brittle tests. Stubs are extremely simple to create and understand, but their stateless nature makes them unsuited for some more complicated cases. Lastly, while fakes are more flexible but require more effort to implement and maintain, they can still be preferable to mocks in most cases.

As a parting note, I want to stress that it's important to make sure that whatever production code you create a double for, needs to be properly unit tested itself (in the inventory example, you'd want unit tests for the repository too). Unit tests can't, and won't catch all kinds of bugs you may have; it's important to make sure you have other kinds of tests in place, such as integration tests that verify that all your system's components are working together as intended.

Thanks to Mark Allison, Daniele Conti, and Lorenzo Quiroli for proofreading ❤️


  1. On top of that, everybody has their own opinions on testing: what it is, how it's supposed to be done, where the line is between unit testing and integration testing, etc. What's presented in this post is, of course, my own opinion. You should form your own opinion; my goal is not to convince anyone I am right, but rather, to spur a conversation. ↩︎