1 Introduction

All version control systems have to solve the same fundamental problem: how will the system allow users to share information, but prevent them from accidentally stepping on each other's feet? It's all too easy for users to accidentally overwrite each other's changes in the repository.

Version control systems usually manage to keep track of changes and merge them together appropriately. However, there are cases when users make overlapping changes and the system cannot reconcile them. This situation is called a conflict, and it's usually not much of a problem. When a user asks his client to merge the latest repository¹ changes into his working copy², his copy of a file with conflicting changes is flagged as being in a state of conflict: he'll be able to see both sets of conflicting changes, and manually choose between them. Note that software can't automatically resolve conflicts; only humans are capable of understanding and making the necessary intelligent choices. Once the overlapping changes are (manually) resolved - perhaps after a discussion with the party who made the conflicting changes - the merged file can safely be committed into the repository.
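The distinction between changes that merge cleanly and changes that conflict can be illustrated with a minimal sketch. It is not how subversion itself is implemented: the function name is hypothetical, and it assumes all three versions of the file have the same number of lines so that lines can be compared by position (real tools align lines with a diff algorithm first).

```python
def three_way_merge(base, mine, theirs):
    """Merge two edited copies of `base`, line by line.

    Simplifying assumption: all three versions have the same number of
    lines, so lines are compared by position.
    """
    if not (len(base) == len(mine) == len(theirs)):
        raise ValueError("this sketch only handles versions of equal length")

    merged, conflicts = [], []
    for number, (b, m, t) in enumerate(zip(base, mine, theirs), start=1):
        if m == t:      # both sides agree, or neither touched the line
            merged.append(m)
        elif m == b:    # only the other user changed this line
            merged.append(t)
        elif t == b:    # only I changed this line
            merged.append(m)
        else:           # both changed it differently: a human must decide
            merged.append(None)
            conflicts.append((number, m, t))
    return merged, conflicts


base = ["colour = red", "size = 10", "shape = round"]
mine = ["colour = blue", "size = 10", "shape = round"]

# The other user edited a different line: the merge succeeds.
theirs = ["colour = red", "size = 12", "shape = round"]
print(three_way_merge(base, mine, theirs))

# The other user edited the same line: line 1 is reported as a conflict.
theirs_overlapping = ["colour = green", "size = 10", "shape = round"]
print(three_way_merge(base, mine, theirs_overlapping)[1])
```

The sketch can only report where the two edits collide; choosing between "blue" and "green" is left to the users, which is the point made above.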

One way to avoid conflicting changes in files is to lock the repository while someone updates files³. This prohibits concurrent changes to files and keeps conflicts to a minimum, but it also becomes a nuisance, since only one person in the team can make changes at a time. Of course, others can keep working on the project and wait for their turn to lock the repository for their commits, but before committing they must check (sometimes without the support of a version control system) that their changes do not contradict other changes made before they obtained the lock.

In the end, it all comes down to one critical factor: communication. When users communicate poorly, both syntactic and semantic conflicts increase. No system can force users to communicate perfectly, and no system can detect semantic conflicts. So there's no point in being lulled into a false promise that a locking system will somehow prevent conflicts.

1.1 Branching patterns

For projects that have a large number of contributors, it's common for most people to have working copies of the trunk⁴. Whenever someone needs to make a long-running change that is likely to disrupt the trunk, a standard procedure is to create a private branch and commit changes there until all the work is complete.

Of course, if you need to go back and fix bugs in previous releases of your software, you will need to check out the revision used for that release, make the necessary changes, and store them as a branch in the subversion tree.

1.1.1 Release branches

Most software has a typical life cycle: code, test, release, repeat. There are two problems with this process. First, developers need to keep writing new features while quality-assurance teams take time to test supposedly-stable versions of the software. New work cannot halt while the software is tested. Second, the team almost always needs to support older, released versions of software; if a bug is discovered in the latest code, it most likely exists in released versions as well, and customers will want to get that bug-fix without having to wait for a major new release.

The subversion book suggests the following procedure for release branches:

: Developers commit all new work to the trunk. Day-to-day changes are committed to /trunk: new features, bug-fixes, and so on.
: The trunk is copied to a “release” branch. When the team thinks the software is getting ready for release, then /trunk might be copied to /branches/1.0-stable.
: Teams continue to work in parallel. One team begins rigorous testing of the release branch, while another team continues new work on /trunk. If bugs are discovered in either location, fixes are ported back and forth as necessary. At some point, however, even that process stops. The branch is “frozen” for final testing right before a release.
: The branch is tagged and released. When testing is complete, /branches/1.0-stable is copied to /tags/1.0 as a reference snapshot. The tag is packaged and released to customers.
: The branch is maintained over time. While work continues on /trunk for the next version, bug-fixes continue to be ported from /trunk to /branches/1.0-stable (or the other way around). When enough bug-fixes have accumulated, management may decide to do a 1.0.1 release: /branches/1.0-stable is copied to /tags/1.0.1, and the tag is packaged and released.
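The copies in the procedure above can be sketched as a small command plan. This is only an illustration: the repository URL, the helper names and the log messages are assumptions, and the commands are printed rather than executed. The only subversion feature relied on is the server-side form `svn copy SRC_URL DST_URL -m MESSAGE`.

```python
import shlex


def create_release_branch(repo_url, series):
    """Command that copies /trunk to /branches/<series>-stable."""
    return [
        "svn", "copy",
        f"{repo_url}/trunk",
        f"{repo_url}/branches/{series}-stable",
        "-m", f"Create the {series}-stable release branch.",
    ]


def tag_release(repo_url, series, version):
    """Command that copies the release branch to /tags/<version>."""
    return [
        "svn", "copy",
        f"{repo_url}/branches/{series}-stable",
        f"{repo_url}/tags/{version}",
        "-m", f"Tag release {version}.",
    ]


# Hypothetical repository URL, used only for illustration.
REPO = "http://svn.example.com/repos/project"

plan = [
    create_release_branch(REPO, "1.0"),  # the trunk is copied to a release branch
    tag_release(REPO, "1.0", "1.0"),     # the branch is tagged and released
    tag_release(REPO, "1.0", "1.0.1"),   # a later maintenance release
]
for command in plan:
    # To execute instead of print: subprocess.run(command, check=True)
    print(shlex.join(command))
```

Porting individual bug-fixes between /trunk and the release branch is a merge in a working copy and is deliberately left out of this sketch.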

1.1.2 Feature branches

We insist that /trunk and release branches compile and pass regression tests at all times. A feature branch is only required when a change requires a large number of destabilising commits. A good rule of thumb is to ask this question: if the developer worked for days in isolation and then committed the large change all at once (so that /trunk was never destabilised), would it be too large a change to review? If the answer to that question is “yes”, then the change should be developed on a feature branch. As the developer commits incremental changes to the branch, they can be easily reviewed by peers.

To prevent branches from growing too far apart from each other, feature branches must be kept in sync with the trunk. We require that branch synchronisation is performed regularly (once every week) against the trunk. Remember to write proper log messages to keep track of merges.

When the development branch is kept in sync with the trunk, it is straightforward to port the branch back to the trunk, since all differences between the branch and the trunk are changes that were made in the branch. Basically, all that needs to be done is to merge by comparing the branch with the trunk.
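The reasoning can be made concrete with a toy model in which each line of development is simply a set of change labels. This is an assumption made for illustration only: real merges operate on file contents and may conflict, and the labels and function names are hypothetical.

```python
def sync(branch, trunk):
    """Bring the branch up to date by merging every trunk change into it."""
    return branch | trunk


def merge_back(trunk, branch):
    """Port the branch to the trunk: apply what the branch has and the trunk lacks."""
    return trunk | (branch - trunk)


trunk = {"t1", "t2"}      # changes on the trunk when the branch is created
branch = set(trunk)       # a branch starts out as a copy of the trunk

branch |= {"f1", "f2"}    # feature work committed on the branch
trunk |= {"t3"}           # meanwhile, others commit to the trunk

branch = sync(branch, trunk)
assert trunk <= branch    # after a sync, the branch contains all of the trunk

# The only remaining differences are the changes made on the branch.
print(sorted(branch - trunk))

trunk = merge_back(trunk, branch)
assert trunk == branch
```

Because the synchronised branch already contains everything on the trunk, comparing the two yields exactly the feature work, which is what gets applied to the trunk.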

1.1.3 Vendor branches

If a project depends on someone else's information, there are several ways to attempt to synchronise that information with the project. Most painfully, one could issue oral or written instructions to all the contributors of the project, telling them to make sure that they have the specific versions of that third-party information that the project needs. If the third-party information is maintained in a subversion repository, one could use subversion's externals definitions to effectively “pin down” specific versions of that information to some location in your own working copy directory (see the section called “Externals Definitions” in the subversion book).

Neither approach helps, however, when the project needs to maintain its own modifications to the third-party information. The solution to this problem is to use vendor branches. A vendor branch is a directory tree in the version control system that contains information provided by a third-party entity, or vendor. Each version of the vendor's data that the project decides to absorb is called a vendor drop.

Vendor branches provide two key benefits. First, by storing the currently supported vendor drop in the version control system, the members of the project never need to question whether they have the right version of the vendor's data. They simply receive that correct version as part of their regular working copy updates. Second, because the data lives in the subversion repository, we can store our custom changes to it in-place - there is no longer any need for an automated (or worse, manual) method for swapping in customisations.

Managing vendor branches generally works like this. Create a top-level directory (such as /vendor) to hold the vendor branches. Then import the third-party code into a sub-directory of that top-level directory. Copy this sub-directory into the main development branch (for example, /trunk) at the appropriate location. Local changes are always made in the main development branch. With each new release of the code we are tracking, we bring it into the vendor branch and merge the changes into /trunk, resolving whatever conflicts occur between local changes and the upstream changes.
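A possible command plan for these steps is sketched below. Several details are assumptions rather than requirements of the text: the layout with a `current` directory plus one tagged copy per vendor drop, the library name `libfoo`, the repository URL and all helper names are hypothetical. The commands are only built, not executed; they use `svn import PATH URL`, `svn copy SRC_URL DST_URL` and the two-URL form `svn merge URL1 URL2 WCPATH`.

```python
def initial_vendor_drop(repo_url, name, version, source_dir, trunk_path):
    """Commands for the first drop: import it, tag it, copy it into /trunk."""
    vendor = f"{repo_url}/vendor/{name}"
    return [
        ["svn", "import", source_dir, f"{vendor}/current",
         "-m", f"Import {name} {version} vendor drop."],
        ["svn", "copy", f"{vendor}/current", f"{vendor}/{version}",
         "-m", f"Tag {name} {version}."],
        ["svn", "copy", f"{vendor}/{version}", f"{repo_url}/trunk/{trunk_path}",
         "-m", f"Bring {name} {version} into the trunk."],
    ]


def merge_new_drop(repo_url, name, previous_version, working_copy_path):
    """Command that merges the upstream changes since the previous drop.

    Assumes the new release has already been committed to vendor/<name>/current
    (that step is not shown here). Run it in a working copy of /trunk, then
    resolve any conflicts with local changes and commit.
    """
    vendor = f"{repo_url}/vendor/{name}"
    return ["svn", "merge", f"{vendor}/{previous_version}", f"{vendor}/current",
            working_copy_path]


# Hypothetical values, used only for illustration.
REPO = "http://svn.example.com/repos/project"
for command in initial_vendor_drop(REPO, "libfoo", "1.0", "./libfoo-1.0", "libfoo"):
    print(" ".join(command))
print(" ".join(merge_new_drop(REPO, "libfoo", "1.0", "libfoo")))
```

Keeping a tagged copy of every drop is what makes the merge possible: the difference between the previous tag and `current` is exactly the upstream change, independent of any local modifications in /trunk.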

1.1.4 Subversion

Each time the repository accepts a commit, this creates a new state of the file system tree, called a revision. Each revision is assigned a unique natural number, one greater than the number of the previous revision⁵. The initial revision of a freshly created repository is numbered zero, and consists of nothing but an empty root directory.
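The numbering scheme can be illustrated with a small in-memory model. It is a sketch only: the class and its methods are hypothetical, and it stores a full snapshot per revision for simplicity, which says nothing about how subversion stores data internally.

```python
class Repository:
    """Toy repository: one snapshot of the whole tree per revision."""

    def __init__(self):
        # Revision 0: nothing but an empty root directory.
        self.revisions = [{}]

    @property
    def head(self):
        """Number of the latest revision."""
        return len(self.revisions) - 1

    def commit(self, changes):
        """Apply `changes` (path -> contents, or None to delete) as ONE new revision."""
        tree = dict(self.revisions[-1])
        for path, contents in changes.items():
            if contents is None:
                tree.pop(path, None)
            else:
                tree[path] = contents
        self.revisions.append(tree)
        return self.head


repo = Repository()
print(repo.head)                                         # a fresh repository is at revision 0
print(repo.commit({"/trunk/a.c": "v1"}))                 # first commit creates revision 1
print(repo.commit({"/trunk/a.c": "v2", "/trunk/b.c": "v1"}))  # two files, still one revision
print(repo.revisions[1])                                 # older revisions remain unchanged
```

Note that the revision number belongs to the tree as a whole: the third call changes two files but creates a single new revision.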

It's important to note that working copies do not always correspond to any single revision in the repository; they may contain files from several different revisions. This happens when you commit a changed file: that file gets a new revision number while the rest stay at their current revision. The revision discrepancy also arises if you svn update specific items in your working copy. To bring everything on par with the latest repository revision, you must do an svn update at the root directory of the working copy.
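How a working copy ends up with mixed revisions can be traced with a small model. The class, file names and revision numbers below are hypothetical; the model only tracks which revision each file is at and ignores file contents.

```python
class WorkingCopy:
    """Toy working copy that records the revision of each file."""

    def __init__(self, paths, revision):
        self.revisions = {path: revision for path in paths}

    def commit(self, path, new_revision):
        # Only the committed file moves to the revision created by the commit.
        self.revisions[path] = new_revision

    def update(self, head, paths=None):
        # Updating selected items moves only those; updating at the root moves all.
        for path in (paths if paths is not None else list(self.revisions)):
            self.revisions[path] = head

    def is_mixed(self):
        return len(set(self.revisions.values())) > 1


wc = WorkingCopy(["a.c", "b.c", "c.c"], revision=4)

wc.commit("a.c", new_revision=5)       # a.c is now at 5, the rest stay at 4
print(wc.revisions, wc.is_mixed())

wc.update(head=6, paths=["b.c"])       # updating one item: a.c at 5, b.c at 6, c.c at 4
print(wc.revisions, wc.is_mixed())

wc.update(head=6)                      # updating at the root: everything at 6
print(wc.revisions, wc.is_mixed())
```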

Once you've finished making changes, you need to commit them to the repository, but before you do so, it's usually a good idea to take a look at exactly what you've changed. By examining your changes before you commit, you can make a more accurate log message. You may also discover that you've inadvertently changed a file, and this gives you a chance to revert those changes before committing. Additionally, this is a good opportunity to review and scrutinise changes before publishing them.

1.2 Donate Your Changes

⁶After making your modifications to the source code, compose a clear and concise log message to describe those changes and the reasons for them. Then, send an email to the developers list containing your log message and the output of svn diff (from the top of your subversion working copy). If the community members consider your changes acceptable, someone who has commit privileges (permission to make new revisions in the subversion source repository) will add your changes to the public source code tree. Recall that permission to directly commit changes to the repository is granted on merit - if you demonstrate comprehension of subversion, programming competency, and a “team spirit”, you will likely be awarded that permission.