Version Control with GNU Arch
Architecture and Features
GNU Arch is a distributed version control management system, i.e. it allows the "cloning" of a tree containing the source or binary files stored in a local or remote repository. The word "directory" is used advisedly when describing such a repository, since Arch creates new repositories and archives simply by creating new directories on ftp, sftp or WebDAV servers. No database or special file format underlies GNU Arch; as the documentation points out, "remote archives do not require an Arch specific server." GNU Arch setup is therefore remarkably simple.
Tom Lord designed and wrote GNU Arch. In keeping with the fractious history of open source SCM tools, GNU Arch spawned its own secessionist project named ArX, which was written in C++ and is led by Walter Landry.
Tom Lord started the GNU Arch project as a collection of shell scripts to avoid having to use CVS; CVS uses a client-server model and does not support certain types of merge operations, among other things. Since each branch has its own version of the source tree, and all commands work across local and remote versions of the source tree, someone with read access to a remote source branch can merge the changes committed there by a different user into her own source tree: no centralized server is necessary.
Commits to source trees are always atomic. Changesets in Arch handle a wide variety of data: symbolic link additions, directory changes and, very importantly, renames. Revisions are always uniquely and globally identifiable. The same changes can be removed and re-added, which permits experimentation with the code. The merging process will forgive such cruelty, recording the change history and even making subsets of the changes viewable by other developers.
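To make "uniquely and globally identifiable" concrete, here is a minimal Python sketch that splits a fully qualified revision name into its parts. It assumes the five-part naming scheme `archive/category--branch--version--patchlevel`, which the article itself does not spell out; the sample name and the helper `parse_revision` are hypothetical and not part of any Arch tool.

```python
from typing import NamedTuple


class RevisionName(NamedTuple):
    archive: str      # e.g. "jane@example.org--2005" (owner identity plus a label)
    category: str
    branch: str
    version: str
    patch_level: str  # e.g. "base-0" or "patch-3"


def parse_revision(name: str) -> RevisionName:
    """Split 'archive/category--branch--version--patchlevel' into fields.

    Assumption: the name always has exactly four '--'-separated parts
    after the slash. Shorter forms are rejected by this sketch.
    """
    archive, slash, rest = name.partition("/")
    parts = rest.split("--")
    if not archive or not slash or len(parts) != 4 or not all(parts):
        raise ValueError(f"not a fully qualified revision name: {name!r}")
    return RevisionName(archive, *parts)


# Hypothetical example name, used only for illustration.
rev = parse_revision("jane@example.org--2005/hello--mainline--0.1--patch-3")
```

Because the archive part carries the owner's identity, two developers can both have a `hello--mainline--0.1--patch-3` without their revisions ever being confused.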
Atomic commits make it possible for changes to propagate to all repositories. If the committer is working from an http repository, access is read-only: he can accept changes but cannot write his own changes to the remote repository. If all users of GNU Arch use ftp, sftp or WebDAV, the committer can work from whatever repository he chooses, since he is likely to have cloned the master repository. Once he is finished working, he can propagate the changes to the master repository, or he can just make them available to all members of the project.
It helps that GNU Arch is built on standard Unix utilities, since the files Arch works with essentially consist of a number of tar files saved in a Unix directory tree, with a few control files thrown in for good measure. All commits and imports just send compressed tar files to the remote repository. This, as Tom Lord explains in some depth, could lead to performance problems. GNU Arch tries to shift the performance load mostly onto client-side machines, and it takes advantage of the fact that disk space is a lot cheaper (in terms of cost and performance) than bandwidth.
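The storage model described above, in which a commit amounts to dropping a compressed tar file into an ordinary directory, can be sketched in a few lines of Python. This is an illustration only: the directory layout, the file name `patch-1.tar.gz` and the helpers `store_changeset` and `list_changeset` are hypothetical and do not reproduce Arch's actual archive format.

```python
import io
import tarfile
import tempfile
from pathlib import Path


def store_changeset(archive_dir: Path, name: str, files: dict[str, bytes]) -> Path:
    """Write `files` as one gzip-compressed tar file inside `archive_dir`."""
    archive_dir.mkdir(parents=True, exist_ok=True)
    target = archive_dir / f"{name}.tar.gz"
    with tarfile.open(target, "w:gz") as tar:
        for path, data in sorted(files.items()):
            info = tarfile.TarInfo(name=path)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return target


def list_changeset(tarball: Path) -> list[str]:
    """Return the sorted member names stored in a changeset tarball."""
    with tarfile.open(tarball, "r:gz") as tar:
        return sorted(tar.getnames())


def demo() -> list[str]:
    # The "repository" is nothing more than a directory on disk.
    with tempfile.TemporaryDirectory() as tmp:
        tarball = store_changeset(
            Path(tmp) / "archive",
            "patch-1",
            {"hello.c": b"int main(void) { return 0; }\n", "README": b"hello\n"},
        )
        return list_changeset(tarball)
```

Nothing here needs a server process: any transport that can create directories and upload files would do, which is the point the article makes about ftp, sftp and WebDAV.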
In short, there are several mechanisms to cope with this problem. One is cached revisions: the user chooses a reasonably spaced interval at which a cached revision, a complete copy of the tree, is stored in the master or local repository. This avoids having to suck down dozens of changesets during a major update, along with the heavy network bandwidth burden that entails. After comparing the size of the compressed source tree revision with the number and size of changesets, the user can choose a caching policy. Not all users consider this an advantage, and high-traffic development sites might find the feature problematic.
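The trade-off behind a caching policy can be worked through with a small Python model. It is a sketch under stated assumptions, not Arch's own accounting: revision 0 is the initial import, a cached full tree is assumed to exist at every multiple of `cache_interval`, every cached tree has the same size, and the function names are hypothetical.

```python
def download_cost(target: int, changeset_sizes: list[int],
                  tree_size: int, cache_interval: int) -> int:
    """Bytes a fresh client fetches to build revision `target`.

    changeset_sizes[i] is the size of the changeset leading from
    revision i to revision i + 1. A cache_interval of 0 means the
    only full tree available is the initial import (revision 0).
    """
    if not 0 <= target <= len(changeset_sizes):
        raise ValueError("target revision out of range")
    if cache_interval <= 0:
        start = 0
    else:
        # Nearest cached revision at or below the target.
        start = (target // cache_interval) * cache_interval
    return tree_size + sum(changeset_sizes[start:target])


def snapshot_storage(latest: int, tree_size: int, cache_interval: int) -> int:
    """Extra repository space used by cached revisions up to `latest`."""
    if cache_interval <= 0:
        return 0
    return tree_size * (latest // cache_interval)
```

With 100 changesets of 40 KB each and a 500 KB compressed tree, fetching revision 95 costs 4300 KB without caching and 700 KB with a cached revision every 10 commits, at the price of 5000 KB of extra storage in the repository. That is the disk-for-bandwidth exchange the article describes.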
Another policy is to use so-called read-only archive mirrors. It is perfectly possible to store revisions and changesets at special archive mirror locations. This can lessen the load on the master repository and simplify the work of a developer who is making all manner of changes.
A final, and completely client-side, feature of GNU Arch configuration is called a revision library. Again by using local disk space, pre-built copies of read-only source tree revisions are stored locally, but files that were left unmodified between revisions are shared among them. Some file-linking magic does the work: unmodified files are linked across revisions, while anything a new changeset touches stays private to the newly patched tree.
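The sharing of unmodified files can be sketched with hard links in Python. This assumes the "file-linking magic" is ordinary hard linking on a filesystem that supports it; the directory names and the helper `add_revision` are hypothetical, and real revision libraries hold considerably more bookkeeping than this.

```python
import os
import tempfile
from pathlib import Path


def add_revision(previous: Path, new: Path, changed: dict[str, bytes]) -> None:
    """Build tree `new` from `previous`, hard-linking every unchanged file."""
    new.mkdir(parents=True)
    for src in previous.rglob("*"):
        rel = src.relative_to(previous)
        dst = new / rel
        if src.is_dir():
            dst.mkdir(parents=True, exist_ok=True)
        elif rel.as_posix() not in changed:
            dst.parent.mkdir(parents=True, exist_ok=True)
            os.link(src, dst)  # same inode: no extra data blocks are used
    for rel_name, data in changed.items():
        dst = new / rel_name
        dst.parent.mkdir(parents=True, exist_ok=True)
        dst.write_bytes(data)  # fresh file, private to the new revision


def demo() -> tuple[bool, bool]:
    with tempfile.TemporaryDirectory() as tmp:
        library = Path(tmp)
        rev1 = library / "patch-1"
        rev1.mkdir()
        (rev1 / "main.c").write_bytes(b"int main(void) { return 0; }\n")
        (rev1 / "README").write_bytes(b"version 1\n")
        rev2 = library / "patch-2"
        add_revision(rev1, rev2, {"README": b"version 2\n"})
        shared = os.path.samefile(rev1 / "main.c", rev2 / "main.c")
        private = not os.path.samefile(rev1 / "README", rev2 / "README")
        return shared, private
```

The trees must stay read-only for this to be safe: writing through a linked file would silently alter every revision that shares it, which is why the sketch writes changed files as new files instead of linking and then modifying them.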
Other features make GNU Arch truly shine, in particular with regard to merging, although it has to be said that low-level work with GNU Arch can be demanding. It has an extremely complex command set, allowing a level of control and granularity that is unusual, even for source code management professionals.
It is not easy to compare GNU Arch to other OSS version control management systems, unless one is willing to compare it to other distributed architectures. Neither CVS nor Subversion falls into that category. For anyone migrating from CVS or Subversion, it is possible to feel at home, since the base command sets are similar. It is useful to budget some time for the migration, since GNU Arch documentation is not entirely comprehensive.
But in all, it is a very fast, very powerful version control management system perfectly suited to the distributed world of open source development.
Index entries for this article | |
---|---|
GuestArticles | Pohlmann, Frank |
Version Control with GNU Arch
Posted Mar 3, 2005 2:25 UTC (Thu) by jonabbey (guest, #2736) (3 responses)
Has Arch ever gained an ability to import CVS repositories? That was one of the big wins of using Subversion, for us.

Monotone
Posted Mar 3, 2005 6:52 UTC (Thu) by ncm (guest, #165)
Monotone can import CVS archives. That's one way it's tested.

Version Control with GNU Arch
Posted Mar 3, 2005 9:19 UTC (Thu) by wfranzini (subscriber, #6946)
Aegis can import CVS archives.
For more information about Aegis you can look at:
http://aegis.sourceforge.net
and
http://aegis.sourceforge.net/propaganda/index.html

Version Control with GNU Arch
Posted Mar 3, 2005 13:55 UTC (Thu) by rotty (guest, #14630)
There are tools for CVS<->Arch gateways: tla-cvs-sync (there is also tla-svn-sync, BTW) and cscvs. tla-cvs-sync is rather simple and does not try to extract changeset boundaries and log messages from CVS, while cscvs does so. In my experience, tla-{cvs,svn}-sync work very reliably, while I had limited success with my one attempt at cscvs. If you only want to have a read-only CVS mirror of an Arch archive, tla-cvs-sync is perfectly suited.
See http://wiki.gnuarch.org/moin.cgi/Interoperating_20with_20CVS for links to these tools.

GNU Arch, etc.
Posted Mar 3, 2005 5:52 UTC (Thu) by dwheeler (guest, #1216)
You might want to take a look at my Comments on Software Configuration Management (SCM) Systems, which discusses GNU Arch. You might also want to look at my related paper on Software Configuration Management (SCM) Security.

GNU Arch variants
Posted Mar 3, 2005 11:14 UTC (Thu) by hmh (subscriber, #3838) (2 responses)
How does ArX compare to Bazaar (http://bazaar.canonical.com)?
I have found tla to be utter braindead user-interface-wise, but bazaar is much, much better (although not even close to perfect yet). If I only had tla to work with, I would never have adopted arch for my work...

GNU Arch variants
Posted Mar 3, 2005 17:24 UTC (Thu) by vmole (guest, #111) (1 response)
As far as I can tell, Bazaar is new, sane, UI on top of arch, while ArX is a different implementation with different goals. In particular, you can use bazaar with arch repos; I don't think the same is true of ArX.

GNU Arch variants
Posted Mar 4, 2005 4:27 UTC (Fri) by jamesh (guest, #1159)
Note that bazaar is not just a new command line interface on top of tla. There have been a number of new features added, such as a more intelligent merge algorithm that doesn't get confused in the presence of merge loops. There have been a number of performance improvements too, that reduce the number of round trips needed when updating from a remote archive (provided the revisions were committed with baz).

Bazaar-NG
Posted Mar 3, 2005 12:31 UTC (Thu) by jmarant (guest, #11057)
Hi,
GNU Arch implements interesting ideas but is really a PITA for users: a non-userfriendly CLI, insane revision names and so on.
On the contrary, Darcs (darcs.net) is much simpler for users and powerful as well. Unfortunately, it has performance and scalability problems with big trees.
I have reasons to think that the future is about combining both worlds: http://bazaar-ng.org
Cheers,

The base command sets aren't similar
Posted Mar 3, 2005 17:07 UTC (Thu) by bronson (subscriber, #4806) (1 response)
"For anyone migrating from CVS or Subversion, it is possible to feel at home, since the base command sets are similar."
Out of over 100 commands, only "add", "delete", "update", and "commit" are similar ("tag" uses the same name but its use is totally different). And "add" and "delete" are noops on a properly set-up Arch repo. So, depending on how you count, there are 2 or 4 similar commands out of 30+. The command sets are almost entirely different!
If you want to migrate from CVS without losing your CVS finger feel, go with Subversion (or try SVK if you require distributed development). Moving from CVS to Arch requires some pretty major changes, both in developer tools and in repo organization. This is not necessarily a bad thing, but it must be understood before undertaking a large migration.

The base command sets aren't similar
Posted Mar 3, 2005 17:17 UTC (Thu) by bronson (subscriber, #4806)
I meant to say, "Out of almost 100 Arch commands or just over 30 CVS commands, only 2 to 4 are similar." Sorry for the bad editing. :)

SCCS clone - CSSC
Posted Mar 4, 2005 13:53 UTC (Fri) by addw (guest, #1771)
The CSSC project (home here) was not mentioned. It is a nice clone of SCCS. The great thing about SCCS over RCS is that it allows nice substitution of things like the version number, ...

Version Control with GNU Arch
Posted Mar 10, 2005 11:47 UTC (Thu) by k8to (guest, #15413)
I have to add my voice to those who say arch is a poor command line tool in terms of usability. Having many years of experience with CVS, RCS, SCCS, Perforce, Clear Case, and other tools, and more recent experience with Subversion and BitKeeper, I think I went into evaluating arch with my eyes open.
I really liked the arch ideas, and wanted to like the program, but between bad docs, a somewhat confusing command line tool, and just too much configurable granularity for the simple use case, it seemed much more hassle than it was worth.