Jump to content

Apache Subversion

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by Adamantios (talk | contribs) at 16:37, 14 July 2009 (→‎Features: redirects to WebDAV article). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

Subversion
Developer(s)CollabNet
Initial releaseOctober 20, 2000 (2000-10-20)
Repository
Written inC
Operating systemCross-platform
TypeRevision control
LicenseApache License
Websitehttp://subversion.tigris.org/

Subversion (SVN) is a version control system initiated in 2000 by CollabNet Inc. It is used to maintain current and historical versions of files such as source code, web pages, and documentation. Its goal is to be a mostly-compatible successor to the widely used Concurrent Versions System (CVS).

Subversion is well-known in the open source community and is used on many open source projects, including Apache Software Foundation, KDE, Free Pascal, FreeBSD, GCC, Python[needs update], Django, Ruby, Mono, SourceForge.net, ExtJS and Tigris.org. Google Code also provides Subversion hosting for their open source projects. BountySource systems use it exclusively. Codeplex offers access to both Subversion as well as other types of clients.

Subversion is also being adopted in the corporate world. In a 2007 report by Forrester Research, Subversion was recognized as the sole leader in the Standalone Software Configuration Management (SCM) category and a strong performer in the Software Configuration and Change Management (SCCM) category.[1]

Subversion is released under the Apache License, making it open source.

History

Subversion was started in 2000 as an effort to write a open source version control system which operated much like CVS but which fixed the bugs and supplied the features missing in CVS. By 2001, Subversion was sufficiently developed to be capable of hosting its own source code.[2]

Features

  • Commits are true atomic operations. Interrupted commit operations do not cause repository inconsistency or corruption.
  • Renamed/copied/moved/removed files retain full revision history.
  • Directories, renames, and file metadata (but not timestamps) are versioned. Entire directory trees can be moved around and/or copied very quickly, and retain full revision history.
  • Versioning of symbolic links.
  • Native support for binary files, with space-efficient binary-diff storage.
  • Apache HTTP Server as network server, WebDAV/DeltaV for protocol. There is also an independent server process that uses a custom protocol over TCP/IP.
  • Branching and tagging are cheap operations, independent of file size, though Subversion itself does not distinguish between a tag, a branch, and a directory
  • Natively client/server, layered library design.
  • Client/server protocol sends diffs in both directions.
  • Costs are proportional to change size, not data size.
  • Parsable output, including XML log output.
  • Open source licensed — "CollabNet/Tigris.org Apache-style license"
  • Internationalized program messages.
  • File locking for unmergeable files ("reserved checkouts").
  • Path-based authorization.
  • PHP, Python, Perl, and Java language bindings.
  • Full MIME support - the MIME Type of each file can be viewed or changed, with the software knowing which MIME types can have their differences from previous versions shown.

Repository types

Subversion offers two types of repository storage — FSFS (Fast Secure File System) and Berkeley DB. FSFS works faster on directories with a large number of files and takes less disk space, due to less logging.[3] Subversion has some limitations with Berkeley DB usage leading to repository corruption and data loss when a program that accesses the database crashes or was terminated forcibly. When using Berkeley DB repository, the only way to use it safely is on the dedicated server and by a single server process running as one user, according to Version Control with Subversion.[4] Existing tools for Berkeley DB repository recovery aren't completely reliable, so frequent repository backups are needed.

Repository access

As of version 1.4, Subversion repositories can be accessed by the following means:

  • Local filesystem or network filesystem,[5] accessed by client directly. This mode uses the file:///path access scheme.
  • WebDAV/DeltaV (over http or https) using the mod_dav_svn module for Apache 2. This mode uses the https://host/path access scheme.
  • Custom "svn" protocol (default port 3690), using plain text or over SSH. This mode uses either the svn://host/path access scheme for unencrypted transport or svn+ssh://host/path scheme for tunneling over ssh.

All three means can access both FSFS and Berkeley DB repositories.

Subversion clients in Version 1.5 support access to WebDAV/DeltaV (over http or https) subversion servers in Version 1.4. or later.

Layers

Subversion is composed internally of several libraries arranged as layers. Each performs a specific task and allows developers to create their own tools at the desired level of complexity and specificity.

Fs
The lowest level; it implements the versioned filesystem which stores the user data.
Repos
Concerned with the repository built up around the filesystem. It has many helper functions and handles the various 'hooks' that a repository may have, e.g. scripts that are run when an action is performed. Together, Fs and Repos constitute the "filesystem interface".
mod_dav_svn
Provides WebDAV/Delta-V access through Apache 2.
Ra
Handles "repository access", both local and remote. From this point on, repositories are referred to using URLs, e.g.
  • file:///path/ for local access,
  • http://host/path/ or https://host/path/ for WebDAV access, or
  • svn://host/path/ or svn+ssh://host/path/ for the SVN protocol.
Client, Wc
The highest level. It abstracts repository access and provides common client tasks, e.g. authenticating the user, or comparing versions. The Wc library is used by Client to manage the local working copy.

Filesystem

The Subversion filesystem can be described as "three dimensional". In addition to the two dimensions of a standard directory tree (e.g., tree view), the Subversion filesystem's third dimension is revisions. Each revision in a Subversion filesystem has its own root, which is used to access contents at that revision. Files are stored as links to the most recent change; thus a Subversion repository is quite compact. The storage space used is proportional to the number of changes made, not to the number of revisions.

The Subversion filesystem uses transactions to keep changes atomic. A transaction is begun from a specified revision of the filesystem, not necessarily the latest. The transaction has its own root, on which changes are made. It is then either committed and becomes the latest revision, or is aborted. The transaction is actually a long-lived filesystem object; a client does not need to commit or abort a transaction itself, rather it can also begin a transaction, exit, and then can re-open the transaction and continue using it. Multiple clients can access the same transaction and work together on an atomic change.

Properties

One important feature of the Subversion filesystem is properties, simple name=value pairs of text. Properties are used in two different places in the Subversion filesystem. The first is on filesystem entries (i.e., files and directories). These are versioned just like other changes to the filesystem. Users can add any property they wish, and the Subversion client uses a set of properties, which it prefixes with 'svn:'.

svn:executable
Makes files on Unix-hosted working copies executable.
svn:mime-type
Stores the MIME type of a file. Affects the handling of diffs and merging.
svn:ignore
A list of filename patterns to ignore in a directory. Similar to CVS's .cvsignore file.
svn:keywords
A list of keywords to substitute into a file when changes are made. The keywords must also be referenced in the file as $keyword$. This is used to maintain certain information (e.g., date of last change, revision number) in a file without human intervention.
svn:eol-style
Makes the client convert end-of-line characters in text files. Used when the working copy is needed with a specific EOL style. "native" is commonly used, so that EOL's match the user's OS EOL style. Repositories may require this property on all files to prevent inconsistent line endings, which can be a problem in itself.
svn:externals
Allows parts of other repositories to be automatically checked-out into a sub-directory.
svn:needs-lock
Specifies that a file is to be checked out with file permissions set to read-only. This is designed to be used with the locking mechanism. The read-only permission is a reminder to obtain a lock before modifying the file: obtaining a lock makes the file writable, and releasing the lock makes it read-only again. Locks are only enforced during a commit operation. Locks can be used without setting this property. However, that is not recommended, because it introduces the risk of someone modifying a locked file; they will only discover it has been locked when their commit fails.
svn:special
This property isn't meant to be set or modified directly by users. Currently only used for having symbolic links in the repository. When a symbolic link is added to the repository, a file containing the link target is created with this property set. When a Unix-like system checks out this file, the client converts it to a symbolic link.

The second place in Subversion where properties are used is on revisions themselves. Like the above properties on filesystem entries the names are completely arbitrary, with the Subversion client using certain properties prefixed with 'svn:'. However, these properties are not versioned and can be changed later.

svn:date
The date and time stamp of when the revision was made.
svn:author
The name of the user that submitted the change(s).
svn:log
The user-supplied description of the change(s).

Branching and tagging

Subversion uses the interfile branching model from Perforce[6] to handle branches and tags. Branching is the ability to isolate changes onto a separate line of development.[7] Tagging is creating a snapshot of the repository's content which unlike a branch is not expected to change in the future.

A new branch or tag is created with the 'svn copy' command, which should be used in place of the native operating system mechanism. Subversion does not create an entire new file version in the repository with its copy. Instead, the old and new versions are linked together internally and the history is preserved for both. The copied versions take up only a little extra room in the repository because Subversion saves only the differences from the original versions.

All the versions in each branch maintain the history of the file up to the point of the copy, plus any changes made since. Changes can be 'merged' back into the trunk or between branches. To Subversion, the only difference between tags and branches is that changes should not be checked into the tagged versions. Due to the differencing algorithm, creating a tag or a branch takes very little additional space in the repository.

Visualization of a very simple Subversion project.

Current limitations and problems

The current version of Subversion only allows directory access control and lacks more granular file access control. That problem dramatically restricts the use of Subversion in projects where directories are not structured to address functional separation among various objects. For example, directories like lib, src, bin do not address security and access control in most cases.

A known problem in Subversion is the implementation of the file and directory rename operation. Subversion currently implements the renaming of files and directories as a 'copy' to the new name followed by a 'delete' of the old name. Only the names are changed, all data relating to the edit history remains the same, and Subversion will still use the old name in older revisions of the "tree". However, Subversion may be confused when files are modified and moved in the same commit. This can also cause problems when a move conflicts with edits made elsewhere[8], for example during merging branches[9]. This problem was expected to be addressed in the Subversion 1.5 release, but only some use cases were addressed while the problems with some other use cases were postponed.[10]

Subversion currently lacks some repository administration and management features. For instance, it is sometimes desired to make edits to the repository to permanently remove all historical records of certain data being in the repository. Subversion does not have built-in support to allow this to be done simply.[11]

Subversion stores additional copies of data on the local machine, which can be an issue for very large projects or files, or if developers are working on multiple branches simultaneously. These .svn directories on the client side can become corrupted by ill-advised user activity.[12]

Releases

CollabNet is still involved with Subversion but the project is run as an independent open source community.[citation needed] The home of Subversion is on Tigris.org, an open-source community dedicated to software engineering tools.[citation needed]

The Subversion open-source community does not provide binaries but these can be downloaded from volunteers and from CollabNet, the initiator of the Subversion project. While the Subversion project does not include an official graphical user interface (GUI) for use with Subversion, a number of different GUIs have been developed, along with a wide variety of additional ancillary software.

See also

Notes

  1. ^ "The Forrester Wave: Software Change and Configuration Management, Q2 2007". Forrester Research.
  2. ^ "Subversion's History", section of Version Control with Subversion, version 1.4
  3. ^ Strategies for Repository Deployment
  4. ^ Ben Collins-Sussman, Brian W. Fitzpatrick, C. Michael Pilato. "SVN Documentation Chapter 5". O'Reilly.{{cite web}}: CS1 maint: multiple names: authors list (link)
  5. ^ Berkeley DB relies on file locking and thus should not be used on (network) filesystems which do not implement them
  6. ^ Inter-File Branching in Perforce
  7. ^ Branching / Tagging — TortoiseSVN
  8. ^ Implement true renames
  9. ^ Advanced Merging
  10. ^ Copy/move-related improvements in Subversion 1.5
  11. ^ svn obliterate
  12. ^ Downsides of Subversion 1.4 for configuration management in large-scale software development

References

  • C. Michael Pilato, Ben Collins-Sussman, Brian W. Fitzpatrick; Version Control with Subversion; O'Reilly; ISBN 0-596-00448-6 (1st edition, paperback, 2004, full book online, mirror)
  • Garrett Rooney; Practical Subversion; Apress; ISBN 1-59059-290-5 (1st edition, paperback, 2005)
  • Mike Mason; Pragmatic Version Control Using Subversion; Pragmatic Bookshelf; ISBN 0-9745140-6-3 (1st edition, paperback, 2005)
  • William Nagel; Subversion Version Control: Using the Subversion Version Control System in Development Projects; Prentice Hall; ISBN 0-13-185518-2 (1st edition, paperback, 2005)

Further reading