I’m looking for an advice reflecting my point of view. First, I will explain my point of view.
My point of view
I’m working in a big company as a C++ developer, we are using GIT. I’m going to make a talk on how our developers should organize GIT commits. The main idea is that most of commits should be pure refactoring with no functional changes.
The workflow of implementing a feature A
should look as follows:
- Find out whether
A
fits to current architecture - If not, refactor current architecture so that feature
A
now fits into it - Implement
A
smoothly, with as little additional changes as possible
The workflow of fixing a bug B
should look as follows:
- Find out the root cause of
B
and whether it is obvious and easy to fix - If not, refactor current code so that
B
is obvious and easy to fix - Fix
B
smoothly, with as little additional changes as possible
In both cases there are as little functional changes to the code as possible. I assume that refactoring changes are better than function changes since they are less bug-prone (some may be automated), easier to test (nothing should change at all) and thus are easier to review (requires less attention from a reviewer).
The ideal situation is when you do several refactoring commits with possibly many lines of code changed, and then a one-line commit which actually fixes the bug. Since bug fixes require high attention from reviewers, the shorter they are, the better.
To enforce this idea I recommend to strictly categorize each commit into one of the following types, explicitly marking each commit message with a little symbol:
- (~) Marks pure refactoring
- (-) Marks bug fix
- (+) Marks new feature implemented
- (*) Marks changes in existing features, which are not due to bug fixes
- (=) Marks some trivial changes (comments, formatting, rename files and variables, etc.)
The recommendation is to favor (=) and (~), and make sure (-) and (*) are as small as possible. As for (+), well… sometimes we just need to write some code 🙂
The question
I would like to hear some others experts opinion about this approach – is this sensible? Are there pitfalls? Has someone experience with a similar model? So far I found this related: Commit Often, Perfect Later, Publish Once: Git Best Practices, but not completely relevant. I don’t remember such recommendations in Macconel’s Code Complete, and IIRC it does not recommend how to organize code changes.
Note that the topic is not specific to GIT, but since GIT has excellent tools for local history management and commit crafting, the topic is tightly bound to GIT.
7
In both cases there are as little functional changes to the code as possible. I assume that refactoring changes are better than function changes since they are less bug-prone (some may be automated), easier to test (nothing should change at all) and thus are easier to review (requires less attention from a reviewer).
They should not require less attention from a reviewer. If they do, that should tell you that you do not treat all your code professionally. Review all your code will the same level of attention and details – you will be much better off.
We did this in one of my past projects as well (“unit test code is not as important as the implementation code; it will not require the same high quality standards;” – as a result, we had to spend a two weeks, half way through the project, just to clean and refactor the test code).
I would like to hear some others experts opinion about this approach – is this sensible?
I think it is not a sensible approach: what you are proposing is the equivalent of a code naming convention, for message commits:
These rules need to be documented, and the documentation needs to be maintained, preferably along with the place where the documentation applies (probably you will need to add it to the same git repository).
It may be much better if you propose a word-tag based approach, and suggest an extensible set of tags for everyone. I tend to use “refactor: “, “cleanup: “, and “+” for adding new code: “+flip module; +flip module unit tests”. This still requires separate documentation for the tags, but not for understanding them at a glance.
Since you are establishing these rules in a larger company, there may also be teams in there where your conventions do not apply (is there any CLI scripting team in your company? they will probably want to structure their commits differently).
There may also be commits that deal with more than one aspect (refactoring and code cleanup in the same commit for example), or teams that may need to add other categories (something for the build system, something for working on docs only, something for maintaining third party libraries in the git repo, etc).
Instead of specifying the commit tags as a standard, I would consider using some more generic rules:
-
write commits in a present tense, imperative form
-
use the header + blank line + details format (or don’t, but either way, specify the format you use)
-
specify a branching policy
-
allow private branches with different/custom/no rules (as long as they are private, they shouldn’t affect anyone but the author(s), and they should increase productivity.
4