An introduction to Version Control Systems for engineers
Every project an engineer works on creates a host of files that
- must not be lost or corrupted
- must be modifiable in a way that can be undone
The most common method I have seen could be summed up as 'Manual Backups and Version Numbers'. This system, or something like it, is the logical end point for an engineer who has been bitten by file corruption or an accidental file overwrite. It tends to appeal to engineers for a few reasons:
- they thought of it themselves
- it only requires tools that they already use (the file browser, copy and paste) and
- it requires a disciplined operator (engineers are used to being rewarded for disciplined behaviour).
The system works like this:
- Once a file has been developed to the point that it is considered important, a version number gets added to its name
- If a big change is planned, the file gets copied and the version number gets incremented
- Whenever someone remembers to, the file gets backed up to another computer
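To make the steps above concrete, here is a minimal Python sketch of what the engineer is doing by hand. The naming convention ("name_v3.ext"), the function names and the backup directory are all my own assumptions for illustration; the manual method does not prescribe any of them.

```python
import re
import shutil
from pathlib import Path

# Assumed naming convention for this sketch: "<name>_v<number><extension>",
# for example "bracket_v3.dxf". The manual method itself does not prescribe one.
VERSION_PATTERN = re.compile(r"^(?P<stem>.+)_v(?P<number>\d+)$")


def next_version_path(path: Path) -> Path:
    """Return the path that the next manual version of `path` would get."""
    match = VERSION_PATTERN.match(path.stem)
    if match is None:
        # First time the file is considered important: start at version 1.
        return path.with_name(f"{path.stem}_v1{path.suffix}")
    number = int(match.group("number")) + 1
    return path.with_name(f"{match.group('stem')}_v{number}{path.suffix}")


def bump_version(path: Path) -> Path:
    """Copy `path` to its next version, refusing to overwrite an existing file."""
    target = next_version_path(path)
    if target.exists():
        raise FileExistsError(f"{target} already exists")
    shutil.copy2(path, target)
    return target


def back_up(path: Path, backup_dir: Path) -> Path:
    """Copy `path` into `backup_dir`, which stands in for another computer."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy2(path, backup_dir / path.name))
```

Note that nothing here records who made the change, why it was made, or which other files changed with it. That information lives only in the operator's head.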
This system appears to be pretty good. It is simple to understand and easy enough to follow (though a little arduous). It is easy to see some simple improvements, such as using a file server to share the files with other engineers or an automatic backup script to guarantee multiple copies. However, there are problems with this system that can best be explained by introducing another solution to the problem.
The other solution to this problem is to use a Version Control System (VCS).
Version control systems are very common within the software development industry. Unfortunately, in my experience their use hasn't really caught on amongst the hard engineering disciplines. If the 'Manual Backups and Version Numbers' method is the logical conclusion for an engineer, then a version control system would have to be described as the logical conclusion for a software developer. The major goals that we were tackling with the method above still apply; however, they have been extended and automated within the version control software. The goals that most VCSs strive to achieve are:
- files, including all previous versions, must not be lost or corrupted
- every modification to a file should be stored in a restorable manner
- the history of modifications should be well documented including the time and date, the author, the files modified and a description of the modification
- any locally modified files should be easy to identify
- the modifications within files should be easy to identify
- the files should be editable by as many people as are required
- if the same file is edited by two people at the same time, neither person's changes will be lost
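As a rough illustration of the first few goals, the sketch below models a history as a list of read-only snapshots, each tagged with an author, a timestamp, a description and the files that changed. This is a toy model of my own, not how Subversion stores data internally, and the names `Revision` and `History` are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from types import MappingProxyType
from typing import Mapping


@dataclass(frozen=True)
class Revision:
    number: int
    author: str
    timestamp: datetime
    message: str
    changed_files: tuple
    snapshot: Mapping[str, str]  # read-only view: file name -> full content


class History:
    """Toy in-memory history; real systems store changes far more compactly."""

    def __init__(self):
        self._revisions = []

    def commit(self, author, message, files):
        """Record the current state of `files` as a new, immutable revision."""
        previous = self._revisions[-1].snapshot if self._revisions else {}
        # A file counts as changed if it was added, removed or edited.
        changed = tuple(sorted(
            name for name in set(previous) | set(files)
            if previous.get(name) != files.get(name)
        ))
        revision = Revision(
            number=len(self._revisions) + 1,
            author=author,
            timestamp=datetime.now(timezone.utc),
            message=message,
            changed_files=changed,
            snapshot=MappingProxyType(dict(files)),
        )
        self._revisions.append(revision)
        return revision

    def checkout(self, number):
        """Return a fresh, editable copy of the files as of revision `number`."""
        if not 1 <= number <= len(self._revisions):
            raise KeyError(f"no such revision: {number}")
        return dict(self._revisions[number - 1].snapshot)

    def log(self):
        """Return all revisions, oldest first."""
        return list(self._revisions)
```

Because `checkout` always hands back a copy, editing an old version while investigating a problem cannot alter what was originally recorded.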
The version control system that I have the most experience with is Subversion, and it does a good job of meeting the goals listed above. The areas in which Subversion excels compared with a manual system are guaranteeing the validity of previous versions, managing the file history and sharing files between multiple engineers.
With any project, whether it is a single source code file or a directory full of documentation, there will be a large number of modifications to the important files. With a manual versioning system each new version is surprisingly taxing. On the surface it would appear as though this is untrue. To create a new version all you need to do is copy and rename the files, right? The taxing parts, though, come before and after this step. Prior to copying your files you need to identify which files have changed. This is particularly difficult if the change was accidental or made by another person. Worse still, once you have a few versions it becomes very difficult to identify which changes were made in which version. As a result, people who use manual version control tend to limit the number of versions they make.
On the other hand, with Subversion creating a new version is very cheap. At a glance you can see whether your files have been modified and whether the same files have been modified by other people, and if you are using text files you can see exactly which characters have changed. Performing a commit usually takes two mouse clicks and a one-paragraph description. For this effort we also get to record the author of the changes, the exact time the modification was made, which files were affected and a brief description of the modifications. This extra detail is extremely useful when you need to track back through your changes. With Subversion, the ease of creating a new version means that there is no fear about making a change to your files. You should commit a new version every time you make a change that could cause a problem. You can then always track back to the exact point at which your troubles started.
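To give a feel for what "seeing exactly what changed" looks like for a text file, here is a small sketch using Python's standard `difflib` module. It is not Subversion itself, and it compares line by line rather than character by character; the function names and the labels in the diff header are my own assumptions.

```python
import difflib


def is_modified(committed: str, working: str) -> bool:
    """Report whether the working copy differs from the last committed text."""
    return committed != working


def describe_changes(committed: str, working: str, name: str = "file.txt") -> str:
    """Return a unified diff between the committed text and the working copy."""
    diff = difflib.unified_diff(
        committed.splitlines(keepends=True),
        working.splitlines(keepends=True),
        fromfile=f"{name} (last commit)",
        tofile=f"{name} (working copy)",
    )
    return "".join(diff)


if __name__ == "__main__":
    committed = "width = 10\nheight = 5\n"
    working = "width = 12\nheight = 5\n"
    if is_modified(committed, working):
        # Removed lines are prefixed with "-" and added lines with "+".
        print(describe_changes(committed, working, "bracket.txt"))
```

With a manual system the equivalent step is opening two copies side by side and hunting for the difference by eye.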
Another important concept that a VCS like Subversion enforces is that you are always working with a copy of the files, not the files themselves. This greatly reduces the risk of accidental modification, and it is a virtue that is hard to bring to a manual system. The problem becomes particularly apparent when you need to investigate an error in a previous version of a file. Because a manual system gives you direct access to the master copy of the old version, there is a risk that it will be modified during the investigation. You may well want to fix the error in the old version; however, unless you have made a copy first, the state of that file when it was originally versioned is lost. What do you do then if the modification you made isn't correct after all? With Subversion that will never happen. Once a version is committed you will always be able to return to that point in time. This is another reason why it is a good idea to commit regularly.
The extra information attached to each commit and the enforcement of always working with a copy of the files also make working with multiple engineers much more reliable. For a start, it is a lot easier to know what changes other people have made when you can see who changed which file and when, along with an explanation of what they did. This also helps with project management, as the improvements and fixes are documented as they are made. The conflict avoidance features, the cornerstone of which is that you are always working on a copy, remove the risk of multiple people overwriting each other's changes.
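The idea behind that conflict avoidance can be sketched in a few lines. In this toy model, which is my own simplification and not Subversion's actual implementation, every working copy remembers the revision it was taken from, and a commit is refused if somebody else has committed in the meantime. The names `SharedFile` and `OutOfDateError` are hypothetical.

```python
class OutOfDateError(Exception):
    """Raised when a commit is based on a revision that is no longer the latest."""


class SharedFile:
    """A single shared file that is only ever edited through copies."""

    def __init__(self, content: str):
        self.revision = 1
        self.content = content

    def checkout(self):
        """Hand out a copy of the content together with the revision it came from."""
        return self.revision, self.content

    def commit(self, base_revision: int, new_content: str) -> int:
        """Accept the change only if it was made against the latest revision."""
        if base_revision != self.revision:
            raise OutOfDateError(
                f"working copy is at revision {base_revision}, "
                f"but the latest is {self.revision}; update first"
            )
        self.content = new_content
        self.revision += 1
        return self.revision


if __name__ == "__main__":
    shared = SharedFile("width = 10\n")
    rev_a, text_a = shared.checkout()  # engineer A takes a copy
    rev_b, text_b = shared.checkout()  # engineer B takes a copy
    shared.commit(rev_a, "width = 12\n")  # A commits first
    try:
        shared.commit(rev_b, "width = 8\n")  # B's copy is now stale
    except OutOfDateError as error:
        print(error)
```

The second engineer is told to update before committing instead of silently overwriting the first engineer's work. A real system then helps to bring the two sets of changes together, which this sketch does not attempt.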
All in all, I think that engineers from all walks of life can benefit from using a version control system such as Subversion. It will greatly reduce the risk of losing important information, ease the process of tracking back through versions and help to manage files used by teams. In the future I will dive into some more specific information related to setting up and using Subversion. I'll also be looking into some of the specific problems associated with keeping engineering files in Subversion.
For further information about version control systems and Subversion:
Windows client - http://tortoisesvn.tigris.org/
Manual - http://svnbook.red-bean.com/
