Tuesday, February 26, 2008

Source Control Branching

Source Control systems have a number of purposes:
  1. Track every change made to the source
  2. Help multiple developers work together
  3. Help multiple teams work in parallel
Different source control systems are better at each of these. For example, Microsoft's SourceSafe works fine for #1 and #2, but sucks at #3. This is because it doesn't really have branching support.

Microsoft's TFS, on the other hand, is quite good at all these things. It is every bit as nice as Subversion or Perforce or Vault. This is mainly because it has real branching.

However, unless your company consists of only 2 or 3 people, you can't just start randomly branching things. You need a basic plan of some sort. A kind of branching strategy.

There are lots of resources out there that provide this kind of information. One of the best I have found, which is specific to TFS, is the Microsoft Team Foundation Server Branching Guidance. There have also been articles in the IEEE about branching, such as The Importance of Branching Models in SCM. And of course there are many blogs which touch on the subject such as Top 10 Tips On Version Control for Small Agile Software Teams.

All of these resources are helpful, but what they indicate is that the kind of branching strategy you use really depends on the structure of your organization. There does not seem to be a one-size-fits-all method. For example, at Microsoft "Feature Teams" reign, and therefore Feature Branches are the way to go. At other big companies the development teams may not even be aware of the branching strategy. They just work in the environment provided for them and other people promote their changes for them. It's also possible that small teams have no need for branching at all.

What I find fascinating in all of this though is the lack of any discussion about Shared Components. I've talked about this in past posts like Two Versions of the Same Shared Assembly. In that series I was trying to find a way to support two different versions of an assembly in the same "Solution" at the same time. Ultimately I determined that there is just no solid way of doing this that is worth the effort.

Now I'd like to look at how having Shared Components affects a branching strategy. All the references I've linked to so far only consider a single project at a time. They don't talk about how a project's dependencies on other projects should be handled in source control.

Let's start by looking at what we want to gain from branching:
  1. Keep untested changes from being released
    1. To clients
    2. To dependent internal projects
  2. Allow projects to decide when to make changes available to other projects
  3. Allow projects to decide when to integrate changes from other projects
The first question: Should you depend on source (project references) or dlls?
The answer depends on your environment. If you need to be able to work on a dependent project and an application hand in hand, at the same time, you have to use source. If you want to be able to perform one-step refactorings, you also have to use source.

You might think that you could switch between dlls and source as needed, but if you work out the details of this you'll see it doesn't work, mainly because you end up working on a "release" source branch, which you shouldn't be making changes to.

So, if you have distinct teams that work on different projects with little overlap, you can probably reference dlls. But if you have people who work on a little of everything, you probably need to reference source.
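
That rule of thumb is simple enough to write down as a few lines of Python. This is only a sketch of the reasoning above; the `TeamSituation` class, its field names, and `reference_kind` are names I made up for illustration and don't correspond to anything in TFS or Visual Studio.

```python
from dataclasses import dataclass


@dataclass
class TeamSituation:
    # Both fields are illustrative names, not taken from any tool.
    edits_dependency_alongside_app: bool  # work on a dependency and an app hand in hand
    wants_one_step_refactoring: bool      # refactor across project boundaries in one go


def reference_kind(situation: TeamSituation) -> str:
    """Return "source" or "dll" following the rule of thumb in the text."""
    if situation.edits_dependency_alongside_app or situation.wants_one_step_refactoring:
        return "source"
    # Distinct teams with little overlap can probably get by with dlls.
    return "dll"


if __name__ == "__main__":
    generalists = TeamSituation(edits_dependency_alongside_app=True,
                                wants_one_step_refactoring=False)
    print(reference_kind(generalists))
```

Either condition on its own is enough to push you toward source references, which is why the check is an "or" rather than an "and".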

The second question: What branching strategies should we consider?
  1. On Demand Application Isolation: Applications branch their code and all dependent code into a TEST or RELEASE folder when they want to isolate themselves, or "freeze" their dependencies. When they are not frozen they simply reference the single source control store of their dependencies.
  2. Team Promotion: Shared components do development work in a DEV branch and when the changes are ready for the wild they merge them into a PUBLIC branch. Applications reference their dependencies' PUBLIC branches.
  3. Team Promotion/Application Isolation: Combination of the other two. Applications reference their dependencies' PUBLIC branches. Then branch everything into TEST and RELEASE.
  4. Distributed Projects: Shared components have DEV and PUBLIC branches. Applications branch from PUBLIC into all of their branches, DEV, TEST, and RELEASE.
  5. Distributed Individuals: Use a source control system like Mercurial, or Git...
Of all of these Distributed Individuals is the most intriguing and flexible. However, it can't be done with a standard centralized source control store, so I have to disregard it for now.

Distributed Projects is the next most flexible. In this model Applications decide when to merge changes from the projects they depend on into their own branches of those projects. Then those applications push changes from DEV -> TEST -> RELEASE. The downside of this model is the huge number of branches and the long path to push changes from Module DEV -> Module PUBLIC -> App DEV -> App TEST -> App RELEASE and back the other way.

Because Distributed Projects has so much overhead I prefer the Team Promotion/Application Isolation model. This model is practically the same. It only differs in that the Application's DEV branch doesn't contain dependency branches. It references them directly. The downside of that is that the Application can't decide when to integrate changes from dependencies into DEV. However, it still decides when to integrate into TEST.
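
To make the overhead difference a little more concrete, here is a rough sketch that writes each model's promotion path as a list of branches and counts the merges needed to carry one shared-component change through to an application release. The branch labels follow the ones used in this post; the list names and the `merge_hops` and `describe` helpers are hypothetical, and the model is simplified to a single dependency moving in one direction.

```python
# Hypothetical model: each path lists the branches a shared-component change
# passes through on its way to an application release.
DISTRIBUTED_PROJECTS = [
    "Module DEV",
    "Module PUBLIC",
    "App DEV",
    "App TEST",
    "App RELEASE",
]

# In Team Promotion/Application Isolation the application's DEV branch
# references the module's PUBLIC branch directly, so there is no separate
# merge into App DEV.
TEAM_PROMOTION_APP_ISOLATION = [
    "Module DEV",
    "Module PUBLIC",
    "App TEST",
    "App RELEASE",
]


def merge_hops(path):
    """Number of merges needed to carry a change from the first branch to the last."""
    return max(len(path) - 1, 0)


def describe(name, path):
    """Format a path and its merge count as one readable line."""
    return f"{name}: {' -> '.join(path)} ({merge_hops(path)} merges)"


if __name__ == "__main__":
    print(describe("Distributed Projects", DISTRIBUTED_PROJECTS))
    print(describe("Team Promotion/Application Isolation", TEAM_PROMOTION_APP_ISOLATION))
```

One merge saved per change doesn't sound like much, but it is saved for every dependency of every application, and again on the way back.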

Note that all of these models would still take advantage of Feature Branches when it was helpful.

Hopefully this has made some sense to you... I was strongly tempted to create some diagrams to go along with all of this and make it easier to comprehend, but I simply didn't have time.

Do you work on projects that employ branching? Do you have to deal with shared components?
