Monday, February 16, 2009

Branching, with TFS

Branching is a powerful source control practice that allows you to create copies of the same source code which can be developed in parallel, then merged one way or the other. This allows you to do all kinds of powerful stuff, like:
  1. Track releases
  2. Isolate different development efforts
  3. Maintain stable code
In principle, it's very simple. In reality it can get really complicated. Like, really.

The main difficulty is that what branching strategy you use depends on what you're trying to accomplish. And you may be trying to accomplish different things at different times.

Microsoft has a good guidance document on branching (and TFS) up on CodePlex: http://www.codeplex.com/TFSGuide

What I'm trying to accomplish with branches is:
  1. Allow developers to develop rapidly
  2. Allow developers to promote code which is "done" and prepare it for testing and release
  3. Track releases

Promotion Branching
Here's one branching structure that seems like it might accomplish that:
Dev -> Main -> Releases
In this structure, everyone develops in a common Dev branch. When their features are done, they merge them to Main. Then a Release branch is created from which RC builds and the final Release build are made.

It turns out that there is a huge problem with this structure. You have many developers working in the shared Dev branch. If they don't all finish their features at the same time (which they never will...), trouble strikes.

For example, imagine Bob makes some changes and adds a new file to the Project. When he checks into Dev his changeset will include a new file and a modification to the Project file. Now Susy makes some changes and adds a new file to the Project. Her changeset will include a new file and a modification to the Project file as well.

Let's say Susy finishes her work before Bob. That's 'cause Bob spent all his time on the phone with who knows who and got little to no work done, while Susy only stopped occasionally to check her makeup.

When Susy merges her work to Main by selecting only her changesets, she'll add her new file and her Project file modifications. Her Project file modifications include the file she added, but they also include Bob's file. This is because Bob added his file first. Unfortunately, her merge will not include Bob's new file because she didn't select Bob's changeset.

The result is a broken project file which is expecting to see Bob's new file, but it isn't there. This problem will occur anytime two people make changes to the same file and one of them merges before the other. Imagine how hairy this will be if the file they both changed is a code file and not just the Project file! I get shivers just thinking about it.
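To make the Bob and Susy scenario concrete, here is a toy model in Python. It is only a sketch of the situation as described above, not a model of TFS's real merge engine: it assumes each changeset carries a full snapshot of the Project file, and the file names (Program.cs, Bob.cs, Susy.cs) and the helpers `merge_selected` and `missing_files` are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Changeset:
    """A toy changeset: the files it adds plus a snapshot of the Project file."""
    author: str
    added_files: frozenset
    project_entries: tuple  # every file the Project file lists after this check-in


def merge_selected(main_files, main_project, changesets):
    """Apply only the selected changesets to Main.

    Assumption: the Project file is taken whole from each selected changeset,
    which is how the scenario in the post plays out.
    """
    files = set(main_files)
    project_entries = tuple(main_project)
    for cs in changesets:
        files |= cs.added_files
        project_entries = cs.project_entries
    return files, project_entries


def missing_files(files, project_entries):
    """Return Project file entries that point at files absent from the branch."""
    return sorted(set(project_entries) - set(files))


# Main starts with one file (hypothetical names throughout).
main_files = {"Program.cs"}
main_project = ("Program.cs",)

# Bob checks into Dev first, then Susy. Her Project file already lists Bob's file.
bob = Changeset("Bob", frozenset({"Bob.cs"}), ("Program.cs", "Bob.cs"))
susy = Changeset("Susy", frozenset({"Susy.cs"}), ("Program.cs", "Bob.cs", "Susy.cs"))

# Susy merges only her own changeset to Main.
files, entries = merge_selected(main_files, main_project, [susy])
print(missing_files(files, entries))  # ['Bob.cs']
```

Merging both changesets in order leaves nothing missing, which is exactly why the problem only shows up when people finish at different times.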

So we can't allow many developers to mix "unfinished" code because there is no simple way to untangle it.

Shelving
If we can't let people mix changes, then we can't have a shared Dev branch. Instead, we'll have to have people work completely separate from each other until they're done. We could do this with private branches. Or we can use Shelves.

In TFS, a Shelf allows you to save some pending changes to the server. You can "unshelve" them later, or even share them with other people. This is very similar to having many private named branches, but there is no change tracking.

The problem here is that the shelves are just enumerated in a huge list. So the more projects you have, the more shelves you have, the more of a nightmare working with shelves will be. Also, you have no change tracking.

Why not use private branches? For one, they won't be private, so they'll clutter the crap out of your Source Control Explorer. And for another, everyone says you shouldn't. Including me. No one ever says why you shouldn't, including me, but that's what they say. So you should trust them. And also trust me.

By Release
Another approach would be to create our Release branches much earlier in the process. Then we'd work directly in the Release branch. This way, changes that are meant to go out at different times don't get all tangled up.

The problem here is that I don't think you're omniscient enough to make this work. What if a feature takes longer to develop than you thought ('cause Bob won't get off the freaking phone)? Do you push back the whole release? Do you just try to disable that code? When do you re-enable it?

What if you have many concurrent releases planned, one for next week, and one for the following week? To get the changes from next week's branch into the following week's branch you have to merge through Main. Which is ok, but will be something of a pain.
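Here is a small Python sketch of why that merge has to go through Main. It assumes changes can only move between a branch and its parent, and the branch names (Release-Week1, Release-Week2) and the `merge_path` helper are hypothetical. It just walks the branch hierarchy and lists the hops.

```python
def ancestors(parents, branch):
    """Return the branch followed by its parent, grandparent, and so on."""
    chain = [branch]
    while parents.get(branch) is not None:
        branch = parents[branch]
        chain.append(branch)
    return chain


def merge_path(parents, source, target):
    """List the branches a change passes through going from source to target.

    Assumption: merges only happen between a branch and its direct parent.
    """
    up = ancestors(parents, source)
    down = ancestors(parents, target)
    # The first branch on the way up from source that is also above target.
    common = next(b for b in up if b in down)
    return up[: up.index(common) + 1] + list(reversed(down[: down.index(common)]))


# Hypothetical hierarchy: two release branches, both created from Main.
parents = {
    "Main": None,
    "Release-Week1": "Main",
    "Release-Week2": "Main",
}

print(merge_path(parents, "Release-Week1", "Release-Week2"))
# ['Release-Week1', 'Main', 'Release-Week2']
```

Every extra concurrent release adds another pair of hops like this, which is where the pain comes from.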

Plus, you have to make sure that all changes made in a Release branch get merged back into Main.

Clearly, a Release branch is not a good place to do development.

Feature Branching
Yet another approach involves creating Feature Branches. Each Feature Branch isolates the development of a particular feature from the development of everything else. When the feature is done, you merge the Feature Branch into Main and delete it.
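As a rough illustration of that isolation, here is a toy Python model. It is a sketch under simplifying assumptions, not how TFS stores branches: a branch is just a set of files plus the set of entries in the Project file, and merging a feature up copies only what that feature added since it was branched. The `Branch` class, `merge_up`, and the file names are all invented for the example.

```python
class Branch:
    """A toy branch: the files it holds and the entries in its Project file."""

    def __init__(self, name, files, project_entries):
        self.name = name
        self.files = set(files)
        self.project_entries = set(project_entries)
        # Remember the starting point so we know what this branch added.
        self.base_files = set(files)
        self.base_entries = set(project_entries)

    def add_file(self, filename):
        """Add a new file and reference it from the Project file."""
        self.files.add(filename)
        self.project_entries.add(filename)


def branch_from(parent, name):
    """Create a feature branch as a copy of the parent's current state."""
    return Branch(name, parent.files, parent.project_entries)


def merge_up(feature, main):
    """Merge everything the feature branch added back into Main."""
    main.files |= feature.files - feature.base_files
    main.project_entries |= feature.project_entries - feature.base_entries


def is_consistent(branch):
    """True when every Project file entry points at a file that exists."""
    return branch.project_entries <= branch.files


main = Branch("Main", {"Program.cs"}, {"Program.cs"})
bob_feature = branch_from(main, "Feature-Bob")
susy_feature = branch_from(main, "Feature-Susy")

bob_feature.add_file("Bob.cs")
susy_feature.add_file("Susy.cs")

# Susy finishes first; Bob is still on the phone.
merge_up(susy_feature, main)
print(is_consistent(main))     # True
print("Bob.cs" in main.files)  # False
```

Because Bob's work never touched Susy's branch, Main stays buildable no matter who finishes first.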

If what you're working on is quick and doesn't deserve a whole branch, then just use shelves.

The first problem with this approach is that you have to be pretty omniscient as well. If you start work in a shelf, then realize you'd like to have a branch, the only way to make that work is by using the tfpt unshelve /migrate power tool. At least there's a way, and it's good enough for me, personally, but a lot of people will find it too much (like Bob, who when he gets off the phone is scared to death of the command line).

Another problem is the potential for a proliferation of feature branches. This depends on the size of your project, but you can easily imagine a situation where there are 4 or 5 different features being developed concurrently. And you'll constantly be creating new branches and deleting old ones. That's a lot of branches, and a lot of maintenance.

The Solution
Boy I wish I knew. I think the solution is Mercurial (which really needs a proper website...). But we use TFS. So the best I can come up with is Shelves and Feature Branches. You wouldn't happen to have any ideas I've overlooked, would you?

3 comments:

  1. We tend to use the feature branch approach mostly. Our branching structure is a little different:

    main
    -> release
    -> feature 1
    -> feature 2
    -> feature 3...

    In this model, "main" is the trunk. Feature work is done in the feature branches. When a feature is completed, we integrate it up to main and then back down into the other feature branches. Then, as we near a release, we branch from main into a release branch, iterate and stabilize in there. We also take regular integrations from the release branch back up to main (and thus back down to the features eventually) to pick up those last minute fixes we had to make in the release branch.

    You can cut down on the number of feature branches by logically grouping features together such that you minimize overlap between groups. I wouldn't create 1 branch per developer. For example, we typically have 8-10 devs and testers working in a feature branch. Sometimes we develop multiple features in parallel in a single feature branch if they're logically coupled or dependent upon each other.

    Hope this helps!

  2. On the last really big project I was on, we took the "1 branch for each group of common features" approach. Similar to what Jason's team does, when a feature was complete, we merged from the trunk down to the feature branch, resolved any conflicts, then merged the feature branch into trunk, and finally deleted the feature branch.

    The approach served the purpose and allowed us to release features in phases, but it was a lot of work. We actually had one guy whose sole role for 2-3 months was to manage all the merges (at one point there were 14 different branches under development).

    If you find an alternative method, I would love to read about it in a followup post!

  3. First off, great article. Really insightful.

    My solution would be to fire Bob and hire someone qualified for his position. No, but seriously, I can see no plausible, effective solution to this problem. I have often wondered the same thing to no avail. Every idea has just slightly too many problems to make it a good lasting choice!
