How do unit tests facilitate refactoring without introducing regressions?

  softwareengineering

My final attempt to ask this question… if it gets poorly received for the third time, I think I'll give up.

We all know the standard TDD cycle: first I write a unit test, the smallest unit test that fails; then I write some production code, again the smallest amount of production code sufficient to pass the test; finally, I refactor. In this way, it is hoped, I will produce code that is 100% covered by unit tests.
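As a minimal sketch of one red-green-refactor step (the `add_vat` function and its rate are hypothetical, not from any real codebase):

```python
import unittest

# Step 1 (red): the smallest failing test, written first.
class PriceTest(unittest.TestCase):
    def test_vat_is_added(self):
        self.assertEqual(add_vat(100), 120)

# Step 2 (green): just enough production code to pass the test.
def add_vat(net_price):
    return net_price * 1.2

# Step 3 (refactor): clean up with the test as a safety net,
# e.g. extract 1.2 into a named VAT_RATE constant.

if __name__ == "__main__":
    unittest.main()
```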

Note that there is an emphasis on unit here. Tests produced in this process are supposed to limit their scope to the method I am changing: every method must be accompanied by a test exercising that particular method. Moreover, all dependencies of the class containing that method are mocked. Each class, therefore, is accompanied by a mock, much as each method is accompanied by a test.
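A sketch of what such a "solitary" unit test looks like, assuming a hypothetical `OrderService` whose only dependency, a repository, is replaced by a mock so the test exercises this one class alone:

```python
from unittest.mock import Mock

class OrderService:
    def __init__(self, repository):
        self.repository = repository

    def total(self, order_id):
        order = self.repository.find(order_id)
        return sum(item.price for item in order.items)

def test_total_sums_item_prices():
    repo = Mock()
    repo.find.return_value = Mock(items=[Mock(price=3), Mock(price=4)])
    assert OrderService(repo).total(42) == 7
    repo.find.assert_called_once_with(42)  # interaction is also asserted
```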

This, it is believed, ensures the correctness of the code; without such coverage the risk of bugs is far greater. Perhaps even more importantly, such code is believed to allow fearless modification without breakage. Without 100% unit-test coverage, any change to untested code risks breaking it; any change of business requirements, any attempt at refactoring, is therefore likely to introduce regressions, making the code fragile and untrustworthy and freezing it in its current state.

I can't see how this is the case. To my mind, unit tests are, rather, inherently hostile to refactoring while providing little confidence in the correctness of the code.

Assume I have a unit of code and a test for it. Now the time comes to refactor. Trivially, this test can never guard against regressions anywhere else in my code, because all dependencies are mocked. But it also can't guard against regressions in the very piece of code it is associated with! Whenever I change this unit of code, or perhaps remove it completely, I will also have to rewrite or remove the test that guards it. The test, therefore, is discarded right at the moment it could become useful; it was useless, and writing it was wasted effort.
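A contrived example of what I mean, with a hypothetical `Discounts` class whose private step is the "unit" being tested:

```python
from unittest.mock import Mock

class Discounts:
    def _rate_for(self, customer):   # the "unit" under test
        return 0.1 if customer.loyal else 0.0

    def price_for(self, customer, net):
        return net * (1 - self._rate_for(customer))

def test_rate_for_loyal_customer():
    assert Discounts()._rate_for(Mock(loyal=True)) == 0.1

# If price_for is later refactored to inline _rate_for, the method
# disappears and this test must be deleted along with it; it cannot
# witness whether the refactoring preserved behavior.
```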

Unit tests are inherently tied to the way code is implemented: by definition they target particular classes and methods, and by their very existence they cement the current implementation. Code cannot be refactored without modifying the tests as well, so refactoring is rendered difficult.

Finally, many bugs stem from misunderstandings of other pieces of code I'm working with, or of the contracts of third-party APIs. Such errors lie not in any particular method or class but in the way methods and classes interact. Unit tests can never catch them.
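For instance (a contrived sketch; the payment gateway and its contract are made up), a mock happily repeats the very misunderstanding that is baked into the production code:

```python
from unittest.mock import Mock

# Suppose the real gateway raises an exception on a declined card,
# but I misremember its contract as returning False instead.
def charge(gateway, amount):
    if not gateway.pay(amount):   # wrong assumption baked in
        return "declined"
    return "ok"

def test_charge_handles_decline():
    gateway = Mock()
    gateway.pay.return_value = False   # the mock repeats my misunderstanding
    assert charge(gateway, 100) == "declined"

# The test passes, yet against the real gateway a declined card raises
# an exception this code never handles. Green tests, broken integration.
```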

Note that I am not trying to say automated tests are useless; it just seems to me that integration tests are better suited here. Firstly, integration tests don't force me to write mocks, so they incur less overhead. Secondly, integration tests by their nature test the way code behaves rather than the way it is written. Refactoring the internals of the code therefore leaves an integration test untouched, letting it actually guard against regressions. Ideally, an integration test only changes when business requirements do.
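A sketch of the kind of test I have in mind, using an in-memory SQLite database so it stays self-contained (the `save_user`/`find_user` helpers are hypothetical); the assertions touch only externally visible behavior:

```python
import sqlite3

def save_user(conn, name):
    conn.execute("INSERT INTO users(name) VALUES (?)", (name,))

def find_user(conn, name):
    row = conn.execute(
        "SELECT name FROM users WHERE name = ?", (name,)
    ).fetchone()
    return row[0] if row else None

def test_saved_user_can_be_found():
    conn = sqlite3.connect(":memory:")   # real database, no mocks
    conn.execute("CREATE TABLE users(name TEXT)")
    save_user(conn, "alice")
    assert find_user(conn, "alice") == "alice"

# Renaming, splitting, or merging the helpers' internals leaves this
# test untouched as long as the observable behavior is preserved.
```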

Of course, there are things that cannot be tested by integration tests. For example, if a method must guard against error conditions that should never occur in practice, I must write a unit test for it. Likewise, if I have a lot of logic or complex algorithms, I must test that logic and those algorithms in particular. For example, if I forgot that a hash map is probably already present in my language's standard library and wrote one myself, I would most likely have to write tests for that hash map in particular. Even then, however, I may limit myself to testing the input/output of the algorithm I'm writing, without testing the particular methods and classes that implement it.
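Sticking with the hash-map example (a deliberately naive, hypothetical implementation), the test below goes through the public put/get interface only, never through the bucket structure behind it:

```python
class MyHashMap:
    def __init__(self, capacity=16):
        self._buckets = [[] for _ in range(capacity)]

    def put(self, key, value):
        bucket = self._buckets[hash(key) % len(self._buckets)]
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)   # overwrite existing key
                return
        bucket.append((key, value))

    def get(self, key):
        for k, v in self._buckets[hash(key) % len(self._buckets)]:
            if k == key:
                return v
        return None

def test_put_then_get_round_trips():
    m = MyHashMap()
    m.put("a", 1)
    m.put("a", 2)   # second put must overwrite the first
    assert m.get("a") == 2 and m.get("b") is None
```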

A well-written set of tests at a higher level of integration may already achieve large code coverage. If code is added to handle some special case, it can be covered by adding that edge case to the suite of integration tests.

Still, I can't see how mocks can be useful. To me they only seem to lower the confidence I can place in my tests. Separating logic from input/output may help reduce the need for mocks.
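A minimal sketch of that separation, assuming a made-up discount-report task: the pure function is tested without any mocks, and only the thin shell touches the file system:

```python
def apply_discount(net_prices, rate):
    # Pure logic: no I/O, trivially testable without mocks.
    return [p * (1 - rate) for p in net_prices]

def discount_report(path, rate):
    # Thin imperative shell: the only place that touches the world.
    with open(path) as f:
        prices = [float(line) for line in f]
    return apply_discount(prices, rate)

def test_apply_discount():
    assert apply_discount([100.0, 50.0], 0.1) == [90.0, 45.0]
```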

This seems to give rise to two approaches that, despite superficial similarity, in practice tend to yield very different code. The first is to focus on achieving 100% coverage with unit tests while mocking everything, rising to the level of integration tests only when something cannot be unit-tested. The second is to start from integration tests, perhaps even in a TDD-like cycle, avoiding mocks where possible and writing unit tests only when necessary.

My question is: how does the first of these approaches (i.e. sticking to the test pyramid and covering the code with lots of mock-based unit tests) grant confidence in the correctness of the code and enable fearless refactoring? Where is the error in my reasoning that unit tests, as opposed to integration tests, grant little confidence and are hostile to refactoring?

Many experienced coders vehemently argue for focusing on unit tests, hinting that I must be missing something.
