All version control systems have to solve the same
fundamental problem: how will the system allow users to share
information, but prevent them from accidentally stepping on
each other's feet? It's all too easy for users to
accidentally overwrite each other's changes in the
repository.
2.2.1. The Problem of File-Sharing
Consider this scenario: suppose we have two co-workers,
Harry and Sally. They each decide to edit the same repository
file at the same time. If Harry saves his changes to the
repository first, then it's possible that (a few moments
later) Sally could accidentally overwrite them with her own
new version of the file. While Harry's version of the file
won't be lost forever (because the system remembers every
change), any changes Harry made won't be
present in Sally's newer version of the file, because she
never saw Harry's changes to begin with. Harry's work is
still effectively lost - or at least missing from the
latest version of the file - and probably by accident.
This is definitely a situation we want to avoid!
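The scenario can be reproduced in a few lines. The sketch below is only an illustration, not how any real version control system is implemented: NaiveRepository and its read and save methods are hypothetical names, and the repository is modelled as a plain list of file versions that accepts every save without checking anything.

```python
class NaiveRepository:
    """Hypothetical repository that accepts any save without checks."""

    def __init__(self, content):
        self.history = [content]  # every saved version is remembered

    def read(self):
        return self.history[-1]

    def save(self, content):
        self.history.append(content)


repo = NaiveRepository("line 1\nline 2\n")

# Harry and Sally both start from the same version of the file.
harry_copy = repo.read()
sally_copy = repo.read()

harry_copy = harry_copy.replace("line 1", "line 1 (Harry)")
sally_copy = sally_copy.replace("line 2", "line 2 (Sally)")

repo.save(harry_copy)  # Harry saves first
repo.save(sally_copy)  # Sally saves a moment later, unaware of Harry's edit

latest = repo.read()
harry_change_survives = "(Harry)" in latest            # missing from the latest version
harry_version_in_history = harry_copy in repo.history  # but still kept in the history
```

Harry's version is still in the history, but the latest version of the file contains only Sally's edit.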
2.2.2. The Lock-Modify-Unlock Solution
Many version control systems use a
lock-modify-unlock model to address
this problem. It is a very simple solution. In such a
system, the repository allows only one person to change a file
at a time. First Harry must lock the file before he can begin
making changes to it. Locking a file is a lot like borrowing
a book from the library; if Harry has locked a file, then Sally
cannot make any changes to it. If she tries to lock the file,
the repository will deny the request. All she can do is read
the file, and wait for Harry to finish his changes and release
his lock. After Harry unlocks the file, his turn is over, and
now Sally can take her turn by locking and editing.
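The turn-taking described above can be sketched as follows. This is a minimal model under stated assumptions, not the locking API of any particular system: LockingRepository, LockError, and the lock, write, and unlock methods are all hypothetical names chosen for the illustration.

```python
class LockError(Exception):
    """Raised when a lock request or a write is refused."""


class LockingRepository:
    """Hypothetical repository that allows one editor per file at a time."""

    def __init__(self):
        self.files = {}
        self.locks = {}  # file path -> user holding the lock

    def lock(self, path, user):
        owner = self.locks.get(path)
        if owner is not None and owner != user:
            raise LockError(f"{path} is locked by {owner}")
        self.locks[path] = user

    def write(self, path, user, content):
        if self.locks.get(path) != user:
            raise LockError(f"{user} does not hold the lock on {path}")
        self.files[path] = content

    def unlock(self, path, user):
        if self.locks.get(path) != user:
            raise LockError(f"{user} does not hold the lock on {path}")
        del self.locks[path]


repo = LockingRepository()
repo.lock("A", "harry")

try:
    repo.lock("A", "sally")  # the repository denies the request
    sally_denied = False
except LockError:
    sally_denied = True

repo.write("A", "harry", "Harry's changes")
repo.unlock("A", "harry")    # Harry's turn is over

repo.lock("A", "sally")      # now Sally can take her turn
repo.write("A", "sally", "Harry's changes + Sally's changes")
repo.unlock("A", "sally")
```

Note that nothing in this model releases a lock on its own; if Harry never calls unlock, Sally stays blocked.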
The problem with the lock-modify-unlock model is that it's
a bit restrictive, and often becomes a roadblock for
users:
Locking may cause administrative problems.
Sometimes Harry will lock a file and then forget about it.
Meanwhile, because Sally is still waiting to edit the file,
her hands are tied. And then Harry goes on vacation. Now
Sally has to get an administrator to release Harry's lock.
The situation ends up causing a lot of unnecessary delay
and wasted time.
Locking may cause unnecessary serialization.
What if Harry is editing the beginning of a text file,
and Sally simply wants to edit the end of the same file?
These changes don't overlap at all. They could easily
edit the file simultaneously, and no great harm would
come, assuming the changes were properly merged together.
There's no need for them to take turns in this
situation.
Locking may create a false sense of security.
Pretend that Harry locks and edits file A, while
Sally simultaneously locks and edits file B. But suppose
that A and B depend on one another, and the changes made
to each are semantically incompatible. Suddenly A and B
don't work together anymore. The locking system was
powerless to prevent the problem - yet it somehow
provided a false sense of security. It's easy for Harry and
Sally to imagine that by locking files, each is beginning a
safe, insulated task, and this belief inhibits them from
discussing their incompatible changes early
on.
2.2.3. The Copy-Modify-Merge Solution
Subversion, CVS, and other version control systems use a
copy-modify-merge model as an
alternative to locking. In this model, each user's client
reads the repository and creates a personal working
copy of the file or project. Users then work in
parallel, modifying their private copies. Finally, the
private copies are merged together into a new, final version.
The version control system often assists with the merging, but
ultimately a human being is responsible for making it happen
correctly.
Here's an example. Say that Harry and Sally each create
working copies of the same project, copied from the
repository. They work concurrently, and make changes to the
same file A
within their copies. Sally saves her changes to
the repository first. When Harry attempts to save his changes
later, the repository informs him that his file A is
out-of-date. In other words, file
A in the repository has changed since he last copied
it. So Harry asks his client to merge
any new changes from the repository into his working copy of
file A. Chances are that Sally's changes don't overlap with
his own; so once he has both sets of changes integrated, he
saves his working copy back to the repository.
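The out-of-date check in this example can be sketched as below. This is an illustrative model only: Repository, OutOfDateError, checkout, and commit are hypothetical names, revisions are assumed to be simple integers, and the merge step is stood in for by reapplying Harry's edit to the latest text rather than by a real merge algorithm.

```python
class OutOfDateError(Exception):
    """Raised when a commit is based on an old revision."""


class Repository:
    """Hypothetical single-file repository with integer revisions."""

    def __init__(self, content):
        self.revisions = [content]

    @property
    def head(self):
        return len(self.revisions) - 1

    def checkout(self):
        # A working copy remembers which revision it was copied from.
        return self.head, self.revisions[self.head]

    def commit(self, base_revision, content):
        if base_revision != self.head:
            raise OutOfDateError("working copy is out-of-date")
        self.revisions.append(content)
        return self.head


repo = Repository("alpha\nbeta\n")
harry_base, harry_text = repo.checkout()
sally_base, sally_text = repo.checkout()

# Sally edits one line and saves first.
sally_text = sally_text.replace("beta", "beta (Sally)")
repo.commit(sally_base, sally_text)

# Harry edits a different line; his first attempt to save is refused.
harry_text = harry_text.replace("alpha", "alpha (Harry)")
try:
    repo.commit(harry_base, harry_text)
    rejected = False
except OutOfDateError:
    rejected = True

# Harry brings in the latest changes, then saves again.
# (Stand-in for a merge: his edit is reapplied on top of the latest text.)
harry_base, latest_text = repo.checkout()
harry_text = latest_text.replace("alpha", "alpha (Harry)")
final_revision = repo.commit(harry_base, harry_text)
final_text = repo.checkout()[1]
```

After the second commit the repository's latest version contains both Harry's and Sally's changes, and nobody had to wait for a lock.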
But what if Sally's changes do overlap
with Harry's changes? What then? This situation is called a
conflict, and it's usually not much of
a problem. When Harry asks his client to merge the latest
repository changes into his working copy, his copy of file A
is somehow flagged as being in a state of conflict: he'll be
able to see both sets of conflicting changes, and manually
choose between them. Note that software can't automatically
resolve conflicts; only humans are capable of understanding
and making the necessary intelligent choices. Once Harry has
manually resolved the overlapping changes (perhaps by
discussing the conflict with Sally!), he can safely save the
merged file back to the repository.
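How a client might tell overlapping changes from non-overlapping ones can be sketched with a simplified three-way comparison. The function merge_lines is a hypothetical helper, not the algorithm any real client uses: it assumes both users only edited lines in place, so all three versions have the same number of lines, and it marks a conflicting line with None instead of choosing a side.

```python
def merge_lines(base, mine, theirs):
    """Simplified three-way merge of equal-length lists of lines.

    Returns (merged, conflicts). A conflicting line is left as None in
    merged and reported as (index, my_line, their_line) in conflicts.
    """
    if not (len(base) == len(mine) == len(theirs)):
        raise ValueError("this sketch only handles in-place line edits")

    merged, conflicts = [], []
    for i, (b, m, t) in enumerate(zip(base, mine, theirs)):
        if m == t:        # both sides agree (or neither changed the line)
            merged.append(m)
        elif m == b:      # only the other side changed this line
            merged.append(t)
        elif t == b:      # only my side changed this line
            merged.append(m)
        else:             # both changed it differently: a human must decide
            merged.append(None)
            conflicts.append((i, m, t))
    return merged, conflicts


base = ["alpha", "beta", "gamma"]
harry = ["alpha (Harry)", "beta", "gamma"]

# Sally changed a different line: the merge is clean.
sally = ["alpha", "beta", "gamma (Sally)"]
clean_merge, clean_conflicts = merge_lines(base, harry, sally)

# Sally changed the same line as Harry: the line is flagged as a conflict.
sally_overlap = ["alpha (Sally)", "beta", "gamma"]
flagged_merge, flagged_conflicts = merge_lines(base, harry, sally_overlap)
```

In the overlapping case the helper returns both competing lines and leaves the choice to Harry, which mirrors the point that the decision itself is left to a person.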
The copy-modify-merge model may sound a bit chaotic, but
in practice, it runs extremely smoothly. Users can work in
parallel, never waiting for one another. When they work on
the same files, it turns out that most of their concurrent
changes don't overlap at all; conflicts are infrequent. And
the amount of time it takes to resolve conflicts is far less
than the time lost by a locking system.
In the end, it all comes down to one critical factor: user
communication. When users communicate poorly, both syntactic
and semantic conflicts increase. No system can force users to
communicate perfectly, and no system can detect semantic
conflicts. So there's no point in being lulled into a false
promise that a locking system will somehow prevent conflicts;
in practice, locking seems to inhibit productivity more than
anything else.
There is one common situation where the lock-modify-unlock
model comes out better, and that is where you have unmergeable
files. For example, if your repository contains some graphic
images, and two people change the same image at the same time, there
is no way for those changes to be merged together. Either Harry
or Sally will lose their changes.
2.2.4. What Does Subversion Do?
Subversion uses the copy-modify-merge solution by default,
and in many cases this is all you will ever need. However,
as of version 1.2, Subversion also supports file locking,
so if you have unmergeable files, or if you are simply
forced into a locking policy by management, Subversion
will still provide the features you need.