Should temporary code be put under version control and how?

Here are some examples of temporary/local code. It is needed in order to work with the codebase, but would be harmful to be part of it:

Project files. Paths may need to be edited in order to reflect the layout on the current PC.
Makefiles. For example optimization may need to be turned off during debugging, but not for the CI server.
Dirty ugly hacks. For example return 7 in the middle of a function, in order to test something, depending on the function, and suspected to break at value of 7. Or 7 is the code of the not yet implemented button, that I am implementing and need to test throughout the life of my branch.

I have tried keeping those in a git commit that I always rebase to the top before pushing to the repo and then push HEAD~. This is quite inconvenient and doesn’t work with svn. Stashing scares me even more – “did I remember to pop after pushing??”.

Keeping the code out of version control introduces unpleasant noise every time a commit is being assembled, plus it might accidentally get introduced into a commit some Friday evening.

What would be a sane solution for such throw-away code?

All code is temporary. When I’m making changes I will introduce placeholders occasionally – that icon that I drew waiting for the real one from the designer, the function I know will call the library that my colleague is writing and hasn’t yet finished (or started), the extra logging that will be removed or otherwise made conditional, the bugs that I will get around to fixing once they’ve been noticed by the test team, etc.

So check everything in. Use a feature branch for all your development, then you can merge the final version into trunk and no-one will need to know what hacks and bodges and fixes you made during your development cycle, they’ll only need to see the final version. But if you’ve committed to your branch regularly, you’ll be able to see the things that were worth keeping if a day went spectacularly wrong, or you continued coding after a lunchtime down the pub.

Version control is not an artifact repository or document storage system. It’s about holding the history of changes. Stick everything you like in there because one day you might want to see what it was, and those are the days you realise what your SCM is truly about.

PS. Truly temporary files (e.g. .obj files or other build artifacts) have no place in your SCM. These are things that have no value to anyone. You can tell what they are – if you delete them you don’t mind, or even notice they’re gone.

Project files. Paths may need to be edited in order to reflect the layout on the current PC.

For project files, the best strategy is when you can generate the project file via a script. Add the actual project file to your ignores, and simply re-generate the project file as necessary. For example, in Java projects, I use gradle which can generate an eclipse project.
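To make the "generate, don't commit" idea concrete, here is a minimal Python sketch rather than a gradle example. It assumes a template that is checked in and a generated file that is ignored; the template fields (project_root, sdk_path) and the function names are hypothetical and do not correspond to any real IDE format.

```python
from pathlib import Path
from string import Template

# Hypothetical template that would live in version control; the generated
# project file itself would be listed in the ignore file instead.
PROJECT_TEMPLATE = Template(
    "project_root=$project_root\n"
    "sdk_path=$sdk_path\n"
)


def render_project_file(project_root: Path, sdk_path: Path) -> str:
    """Fill the tracked template with the paths of the current machine."""
    return PROJECT_TEMPLATE.substitute(
        project_root=project_root.as_posix(),
        sdk_path=sdk_path.as_posix(),
    )


def write_project_file(target: Path, project_root: Path, sdk_path: Path) -> None:
    """Regenerate the untracked project file; safe to re-run whenever needed."""
    target.write_text(render_project_file(project_root, sdk_path), encoding="utf-8")
```

Only the template and the script are tracked, so no one ever has to commit (or avoid committing) machine-specific paths.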

Makefiles. For example optimization may need to be turned off during debugging, but not for the CI server.

You should be able to switch between optimization and debug mode without modifying your Makefile. Instead, use a command line flag, environment variable, or separate file not in your repository to control that.
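As a rough illustration of the environment-variable approach, here is a small Python sketch of a build helper that chooses compiler flags without any tracked file being edited. The variable name BUILD_MODE and the flag sets are assumptions made up for this example; a real Makefile would express the same idea in its own syntax.

```python
import os
from typing import List, Mapping

# Hypothetical flag sets; a real project would keep these next to its build rules.
FLAGS_BY_MODE = {
    "release": ["-O2", "-DNDEBUG"],
    "debug": ["-O0", "-g"],
}


def compiler_flags(environ: Mapping[str, str] = os.environ) -> List[str]:
    """Pick flags from BUILD_MODE (an assumed name), defaulting to release."""
    mode = environ.get("BUILD_MODE", "release")
    if mode not in FLAGS_BY_MODE:
        raise ValueError(f"unknown BUILD_MODE: {mode!r}")
    return list(FLAGS_BY_MODE[mode])
```

The CI server leaves the variable unset and gets the optimized build, while a developer sets BUILD_MODE=debug in their shell for a debugging session.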

Dirty ugly hacks. For example return 7 in the middle of a function, in order to test something, depending on the function, and suspected to break at value of 7. Or 7 is the code of the not yet implemented button, that I am implementing and need to test throughout the life of my branch.

Can’t you write a test that induces the suspected failure case?

For most cases, you should be able to adjust your workflow so you don’t make these changes to files in your repository. Files which are changed locally should be added to your project’s ignore mechanism and kept out of the repository. In some cases, you’ll still make temporary changes you don’t want put into the repository. For those, add a special marker sequence such as XXX, and add a pre-commit hook that rejects commits that still contain it.
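A minimal sketch of such a hook in Python, assuming Git and the marker XXX from the paragraph above. The function names are my own; the only Git feature relied on is `git diff --cached`, which prints the staged changes as a unified diff.

```python
import subprocess
import sys
from typing import List

MARKER = "XXX"  # the special sequence; pick something that never occurs in real code


def added_lines_with_marker(diff_text: str, marker: str = MARKER) -> List[str]:
    """Return the added lines of a unified diff that contain the marker."""
    hits = []
    for line in diff_text.splitlines():
        # Added lines start with '+'; '+++ ' introduces a file header, not content.
        if line.startswith("+") and not line.startswith("+++ ") and marker in line:
            hits.append(line[1:])
    return hits


def main() -> int:
    """Exit status for the hook: non-zero makes Git abort the commit."""
    diff = subprocess.run(
        ["git", "diff", "--cached", "--unified=0"],
        capture_output=True, text=True, check=True,
    ).stdout
    hits = added_lines_with_marker(diff)
    for hit in hits:
        print(f"commit blocked, {MARKER} found: {hit.strip()}", file=sys.stderr)
    return 1 if hits else 0

# To use this as a hook, save it as an executable .git/hooks/pre-commit
# and end the file with: sys.exit(main())
```

Checking only added lines means a marker that already exists in old code does not block unrelated commits.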

Version control should contain code and configuration which is needed to build the application.

This means that:

Temporary stuff which was introduced for a short amount of time (the time required to pinpoint the location of a bug, or to experiment with a feature of a language, for example) shouldn’t be in version control: keep it for as long as you need it, then simply remove it before committing.
Local files which are specific to a particular machine may be kept in a branch.

I would avoid keeping them just locally, since it’s too painful to redo all this stuff when your laptop is stolen or a virus forces you to reinstall the OS (and, by the way, you find that your last backup was done two years ago).

On the other hand, be careful with file structure: local config is OK until it becomes overwhelming and forces you to repeat a single change in the files of each of the 42 developers participating in the project.

Watch for opportunities to remove the differences between machines. This may mean:
- Giving access to a dev SQL server to replace local instances on developers’ machines,
- Using package distribution services like PyPI or npm for public packages and their private counterparts for in-house packages,
- Asking members of the team to install the same versions of software,
- Making software updates as transparent as possible,
- Or making it possible to deploy the OS and the needed software on a machine in one click (plus the time for every developer to install their preferred tools: Vim vs. Emacs, Chrome vs. Firefox, etc.)

So:

Project files. Paths may need to be edited in order to reflect the layout on the current PC.

Why not use the same layout on every PC? Paths within the project should be relative to the project file, which means that it doesn’t matter where the project is located. Versions of software and libraries should also be the same everywhere, to avoid cryptic bugs which appear on some machines only and are impossible for other members of the team to reproduce.

Example:

In a project created with Visual Studio, you may find:

The files themselves. Paths being relative, it doesn’t matter whether on my machine, the project is located in H:\Development\Hello World Project while other members of the team checked out the project into C:\Work\HelloWorld.
The dependencies, i.e. third party and in-house libraries. Both types should be handled by NuGet which makes all conflicts-related discussions obsolete. If you don’t have the same version of the library I have, ask NuGet to update the dependencies. As simple as that (when it works well, which is not always the case).

Note that it is crucial to keep in-house libraries in a private NuGet feed as well. Having a bunch of libraries stored in a shared folder or sent by e-mail across a team leads to anarchy and depressed CI servers.
The settings. It’s crucial that the team shares the same settings. If half of the team decides to treat warnings as errors and half of the team keeps warnings as-is, the members of the first part of the team will spend their time removing warnings generated by the developers from the second part of the team.
The utilities-related settings. Those are tricky, because some members of the team may have installed some utilities, while others haven’t.

It is strongly recommended to have the same toolset installed. If some programmers want to use StyleCop, but others don’t, the team won’t get the job done. If some use Code Contracts but others don’t, they will have the same issues.

Makefiles. For example optimization may need to be turned off during debugging, but not for the CI server.

Keep several makefiles in version control. It is not unusual to build a debug version on the CI server as well and to push it to a client who is experiencing a tricky bug.

Dirty ugly hacks. For example return 7 in the middle of a function, in order to test something, depending on the function, and suspected to break at value of 7.

I would avoid such code in the first place. In order to test something, use unit tests. If it really takes a few seconds to swap some code for the purpose of debugging, then do it, but you’ll remove this code in a few minutes anyway, so there is no need to commit it.

As you describe it, you should write a test. For example, if you want to be sure that:

class TemperatureConverter
{
    public int CelsiusToFahrenheit(int temperature)
    {
        ...
    }
}

throws an exception when temperature is below the AbsoluteZero constant, you shouldn’t play with the code itself. Instead, create a unit test which will:

self-document your code,
increase the reliability of your code,
ensure that maintainers can rely on regression testing when modifying the method above,
serve other developers on your team who may need to do the same test.
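Such a test could look roughly like the following, sketched in Python with the standard unittest module instead of C#. The body of the conversion function is elided in the snippet above, so the implementation here (including the choice of ValueError and the value of the constant) is an assumption made only so the test has something to run against.

```python
import unittest

ABSOLUTE_ZERO_CELSIUS = -273.15  # assumed value of the AbsoluteZero constant


def celsius_to_fahrenheit(temperature: float) -> float:
    """Convert Celsius to Fahrenheit, rejecting temperatures below absolute zero."""
    if temperature < ABSOLUTE_ZERO_CELSIUS:
        raise ValueError(f"{temperature} is below absolute zero")
    return temperature * 9 / 5 + 32


class CelsiusToFahrenheitTest(unittest.TestCase):
    def test_rejects_temperature_below_absolute_zero(self):
        # The suspected failure case lives here, not as a hack in the method.
        with self.assertRaises(ValueError):
            celsius_to_fahrenheit(-300)

    def test_converts_freezing_point(self):
        self.assertEqual(celsius_to_fahrenheit(0), 32)
```

The test can be committed as-is, so nothing temporary ever touches the production method.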

We use @@ comments in the code to indicate anything not being quite ready, for testing purposes, etc.

That way we can commit, colleagues don’t have to wait too long to sync, and can see where there’s still work in progress (e.g. understand why a part is not yet fully working).

We do a global search for @@ to prevent any ‘leftovers’ before entering the final stages of beta testing etc.

Using that discipline, I see no reason to not just commit. This way, we don’t have separate branches and only one extra ‘protocol’ to follow.

As an extra benefit, these todo’s (usually small things) are always in the code. The developer working on them can quickly go over them, and there’s no need to keep separate lists.
You know how development goes: you are working in one place but you are constantly using your mind as a stack (‘I should change that over there when I’m done here’). Just jotting down a quick @@ remark prevents stack overflow.

I even use @@name to indicate issues that I need to discuss with ‘name’.
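The global search can be as simple as a grep, but grouping the hits by name takes a few more lines. Here is a small Python sketch; the function name and the convention that unaddressed notes are filed under an empty name are my own assumptions, not part of the workflow described above.

```python
import re
from collections import defaultdict
from typing import Dict, List

# "@@" optionally followed directly by a name, e.g. "@@alice discuss the API".
MARKER_RE = re.compile(r"@@(\w+)?")


def collect_markers(source: str) -> Dict[str, List[int]]:
    """Map each name ("" for unaddressed notes) to the 1-based lines carrying @@."""
    found = defaultdict(list)
    for number, line in enumerate(source.splitlines(), start=1):
        match = MARKER_RE.search(line)  # only the first marker on a line counts
        if match:
            found[match.group(1) or ""].append(number)
    return dict(found)
```

Running this over every source file before a release gives both the leftover list and a per-person agenda for the @@name items.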

2 HAMSTER solutions:

You can use a pre-commit hook to check your code for some unusual keyword like HAMSTER. Just don’t let people commit code that has been HAMSTERed and use it whenever you do dirty hacks.
Another option, for example in C, is to wrap the hack in #ifdef HAMSTER. The code is then only compiled on your machine, where you pass a compiler flag that defines HAMSTER.

We put everything under source control needed to build and test the current binaries and understand why things were designed/implemented/tested the way they are.

That even holds for spikes http://www.extremeprogramming.org/rules/spike.html, like the ones you described; we just host them in a different sub-tree.

Here are a number of solutions I occasionally use myself under various circumstances, and that you might consider helpful when applied to your own workflows:

Lightweight branches that can be squashed.

Git is great at this. Hack on a branch, make lots of commits, and then rebase or squash your history to edit out the noise.

Use a patch queue on top of your SCM.

I often find myself using StGit to float patches to the top of my current branch. When I’m done with the branch, I can pop them back off the stack before merging, squashing, or rebasing, or I can merge them into the main codebase if I want to keep them around.

Use RCS as an “out-of-band” SCM for small experiments.

Sometimes you just want to checkpoint a file in progress in a disposable way, without having to clean up history afterwards. I typically use RCS for this inside of Git or SVN. I tell Git to ignore RCS artifacts, checkpoint my work in progress in RCS, and when I like the results I just toss the *,v files or the whole RCS directory. Just don’t run git clean -fdx or similar until you’ve committed your work to your “real” SCM, or you’ll regret it.

Named stashes.

Another Git-ism, but handy: git stash save --include-untracked <some-cool-title> can be useful in a pinch. You can save, pop, and apply work in progress this way, and view your various checkpoints through git stash list or git reflog --all. Other SCMs may have similar features, but your mileage may vary a lot with this one.

Some of that temporary code is really just a manifestation of improper build/test/development methodology, and hopefully its existence will motivate future improvement.

On git at least, you should be free to mess around with any number of feature branches until they are ready to be merged into master/trunk.

Version control is supposed to help you, and more often than not I appreciate the insights from the way mistakes (or maybe just less-than-intuitive decisions) were made in the past, and make more informed decisions for the present.

I believe that some systems will throw warnings on seeing TODO in a comment, so

// TODO: remove this hack.

might be all that is necessary if you can find a relevant option in some part of your development environment, or just stick some sort of grep command in your buildfile. It might also be possible to arrange for // HACK or any arbitrary string to be picked up.

This is simpler than organising your code in a particular way and hoping that people will remember not to use it. It also makes it safer to follow @gbjbaanb’s advice (if you can ensure that everyone is seeing the warnings!).

Stick everything you like in there because one day you might want to see what it was, and those are the days you realise what your SCM is truly about.

It is never harmful to put code in source control.

Every single one of the items you mention should be in source control.

Filed under: softwareengineering - @ 01:17

Tags: version-control
