Maildir

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by 147.188.192.41 (talk) at 16:07, 10 November 2010 (Add mairix to "mail index and search tools"). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

The Maildir e-mail format is a common way of storing e-mail messages, where each message is kept in a separate file with a unique name, and each folder is a directory. The local filesystem handles file locking as messages are added, moved and deleted.

Specifications

A Maildir directory (often named Maildir) usually has three subdirectories named tmp, new, and cur.
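
The layout described above can be sketched in a few lines of Python. This is a minimal illustration, not part of any specification: the helper name `create_maildir` and the 0o700 permission mode are assumptions made for the example.

```python
import os

def create_maildir(path):
    """Create a Maildir at `path` with its tmp, new and cur subdirectories."""
    for sub in ("tmp", "new", "cur"):
        # mode 0o700 is an assumption for this sketch (private to the owner)
        os.makedirs(os.path.join(path, sub), mode=0o700, exist_ok=True)
    return path
```

Calling `create_maildir("Maildir")` in a user's home directory would produce the three subdirectories; calling it again is harmless because existing directories are left alone.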

Maildir

The original Maildir specification was written by Daniel J. Bernstein, the author of qmail, djbdns, and other software.[1] Although the original specification was written specifically for Bernstein's qmail, it is general enough to be implemented in many programs. Over time, and across many independent implementations, a small number of shortcomings have been discovered.[citation needed]

Maildir++

Sam Varshavchik, the author of the Courier Mail Server and other software, wrote an extension[2] to the Maildir format called Maildir++ to support subfolders and mail quotas. Maildir++ directories contain subdirectories with names that start with a '.' (dot) that are also Maildir++ folders. This extension therefore violates the Maildir specification, which provides an exhaustive list of the possible contents of a Maildir. It is a compatible violation, however, and other Maildir software supports Maildir++.
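
As a rough illustration of the dot-prefixed subfolder convention, the following Python sketch lists the Maildir++ folders under a top-level Maildir. The function name is hypothetical, and the check that each candidate contains tmp, new and cur is an assumption of this example rather than a rule quoted from the Maildir++ document.

```python
import os

def list_maildirpp_folders(root):
    """Return the names of dot-prefixed subdirectories of `root` that look like Maildirs."""
    folders = []
    for entry in sorted(os.listdir(root)):
        full = os.path.join(root, entry)
        if not (entry.startswith(".") and os.path.isdir(full)):
            continue
        # Assumption: only count a dot-directory that has the three Maildir subdirectories
        if all(os.path.isdir(os.path.join(full, sub)) for sub in ("tmp", "new", "cur")):
            folders.append(entry)
    return folders
```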

Problem space addressed by Maildir

Mail needs to be stored in these circumstances:

  • By an SMTP MTA, after receiving from a remote mail server and while it is waiting to be delivered elsewhere. The storage area used by the MTA is often called a spool.
  • By an IMAP mailstore, which serves email to mail client software (MUAs).
  • In a local user account where the user can read email using an MUA that reads the mail data directly rather than via a network protocol.
  • In other storage and processing situations, such as when filtering spam.

RFC 822 and related standards define an email message as lines of text, with strict rules concerning the first lines of text. This matches the idea of a file very well. Maildir's one-file-per-message design matches precisely what can be seen by watching email transit a network by means of protocols such as SMTP. An MTA typically processes batches of email sequentially, so again message-per-file is a good match.

A directory containing many files, each holding one message, is not sufficient on its own for a mailstore or any other use that requires random access to email. Many implementors use a database because it is designed for indexing and searching. As of 2007, filesystems usually gave much better access times than databases,[citation needed] so the questions facing implementors come down to indexing methods and programming convenience versus speed, efficiency, reuse of existing technology and reliability. The Cyrus IMAP server, the MH Message Handling System, the Dovecot IMAP server and the UW IMAP server all have private, mutually incompatible file-per-message storage formats with associated indexing schemes. (Dovecot and UW IMAP also implement formats that can be accessed by other software.)

Technical operation

The process that delivers an e-mail message writes it to a file in the tmp directory with a unique filename. The current algorithm for generating the unique filename combines the time, the host name, and a number of pseudo-random parameters to ensure uniqueness.[1]
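
A unique name of this kind could be built as in the following Python sketch. The exact field layout (seconds, then microseconds, process ID and random bytes, then the host name) and the `M`/`P`/`R` prefixes are assumptions chosen for illustration; real delivery agents differ in the middle component. The escaping of '/' and ':' in the host name keeps those characters out of the filename.

```python
import os
import socket
import time

def unique_name(now=None, hostname=None):
    """Build a filename from the time, some per-delivery values, and the host name."""
    now = time.time() if now is None else now
    hostname = socket.gethostname() if hostname is None else hostname
    # Keep path and info separators out of the name
    hostname = hostname.replace("/", "\\057").replace(":", "\\072")
    seconds = int(now)
    micros = int((now - seconds) * 1_000_000)
    rand = os.urandom(4).hex()  # pseudo-random component (layout is hypothetical)
    return f"{seconds}.M{micros}P{os.getpid()}R{rand}.{hostname}"
```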

The delivery process stores the message in the maildir by creating and writing to tmp/unique, and then moving this file to new/unique. The moving is commonly done by hard linking the file to new and then unlinking the file from tmp, but some implementations simply rename() it there. This sequence guarantees that a maildir-reading program will not see a partially-written message, as MUAs never look in tmp.
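
The delivery sequence can be sketched in Python as follows. This is a simplified illustration using the link-then-unlink variant; the function name `deliver`, the file mode and the explicit `fsync` are assumptions of the example, and a real delivery agent would add timeouts and more careful error handling.

```python
import os

def deliver(maildir, unique, data):
    """Write `data` to tmp/unique, then make it appear in new/unique."""
    tmp_path = os.path.join(maildir, "tmp", unique)
    new_path = os.path.join(maildir, "new", unique)
    # O_EXCL makes creation fail if the name is already in use
    fd = os.open(tmp_path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # push the message to disk before publishing it
    except BaseException:
        os.unlink(tmp_path)
        raise
    # Publish the complete file: hard link into new/, then drop the tmp/ name
    os.link(tmp_path, new_path)
    os.unlink(tmp_path)
    return new_path
```

Because the file only gains a name in new after it has been fully written, a reader scanning new either sees the whole message or does not see it at all. An implementation that prefers rename() would replace the last two calls with a single `os.rename(tmp_path, new_path)`.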

When the mail user agent process finds messages in the new directory, it moves them to cur and appends an informational suffix to the filename before reading them. The move is done with rename(), because a link-then-unlink strategy may leave the message duplicated. The informational suffix consists of a colon (to separate the unique part of the filename from the actual information), a '2', a comma and various flags. The '2' specifies, loosely speaking, the version of the information that follows the comma. '2' is the only officially specified version; '1' is an experimental version, presumably used while the Maildir format was under development. The specification defines flags that show whether the message has been read, deleted and so on: the initial (capital) letter of Passed, Replied, Seen, Trashed, Draft, and Flagged.[1] Dovecot uses lowercase letters to match 26 IMAP keywords,[3] which may include standardised keywords such as $MDNSent, and user-defined flags.
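
The suffix handling can be illustrated with a short Python sketch. The names `parse_info` and `collect_new` are hypothetical; the sketch only handles version '2' suffixes and treats anything else as having no flags, which is a simplification chosen for the example.

```python
import os

# Flag letters defined by the original specification
FLAG_NAMES = {"P": "Passed", "R": "Replied", "S": "Seen",
              "T": "Trashed", "D": "Draft", "F": "Flagged"}

def parse_info(filename):
    """Split a filename into (unique part, set of flag letters)."""
    unique, sep, info = filename.partition(":")
    if not sep or not info.startswith("2,"):
        return unique, set()  # no suffix, or not a version-2 suffix
    return unique, set(info[2:])

def collect_new(maildir, flags=""):
    """Move every message from new/ to cur/, appending a ':2,' suffix."""
    moved = []
    new_dir = os.path.join(maildir, "new")
    for name in sorted(os.listdir(new_dir)):
        # Flags are sorted so the same flag set always gives the same name
        target = name + ":2," + "".join(sorted(flags))
        os.rename(os.path.join(new_dir, name),
                  os.path.join(maildir, "cur", target))
        moved.append(target)
    return moved
```

For example, `parse_info("1289405220.M1P2.host:2,RS")` yields the unique part `1289405220.M1P2.host` and the flag letters R and S, which `FLAG_NAMES` maps to Replied and Seen.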

Technical issues

Inconsistent state with lockless operation

Daniel J. Bernstein designed Maildir to be safely writable by multiple concurrent writers without any form of locking, even over NFS. This works reasonably well in practice, but readers can observe an inconsistent view of the mailbox. During directory listing, any files that are renamed after the first readdir() system call and before the last readdir() call may not appear in the listing. This causes the listing process to believe that the message was deleted, while in reality only its flags were changed. When the process lists the messages again, the "deleted" message suddenly reappears. Some mail-accessing programs layer their own locking on top of Maildir in an attempt to prevent this kind of problem. Dovecot, for example, uses its own non-standard locking with Maildir.

Mac OS X with HFS Plus (but not with ZFS) appears to avoid this problem for some reason.[citation needed] On Linux, the issue can also be avoided by listening for changes in the Maildir with inotify and, after the readdir() calls, checking whether inotify reports any new files.[citation needed]
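
One simple mitigation, which is not described in the sources cited here and is offered only as a sketch, is to key messages by the unique part of their filename and to rescan before concluding that a message was deleted. The function names and the default of two rescans are assumptions; this reduces the chance of a false "deleted" report but does not eliminate the race.

```python
import os

def unique_part(filename):
    """Return the part of a filename before the ':' info separator."""
    return filename.split(":", 1)[0]

def scan(maildir):
    """Map each unique name to its current relative path in new/ or cur/."""
    found = {}
    for sub in ("new", "cur"):
        for name in os.listdir(os.path.join(maildir, sub)):
            found[unique_part(name)] = os.path.join(sub, name)
    return found

def confirmed_deletions(maildir, previous, rescans=2):
    """Return unique names from `previous` that stay missing in every rescan."""
    missing = set(previous)
    for _ in range(rescans):
        missing -= set(scan(maildir))  # anything seen again was only renamed
        if not missing:
            break
    return missing
```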

Windows compatibility

The Maildir standard cannot be implemented without modification on systems running Microsoft Windows, which does not accept colons in filenames. Software on Windows can use an alternative separator (such as ";" or "-"), but there is currently no agreement on what character this should be, so one Windows program may write Maildir files that are unreadable to another Windows program. Several programs that support Maildir are written in languages such as Python and Perl, or have been ported from Unix using Cygwin or similar systems; these could function reliably together if this issue were addressed.
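
A reader that wants to tolerate several separators could recognise the suffix with a pattern like the one in this Python sketch. The set of accepted separators (':', ';' and '-') simply mirrors the examples above, and the function name is hypothetical; no agreed convention exists, so this is an illustration rather than a recommendation.

```python
import re

# Unique part, one candidate separator, then a version-2 info suffix
_INFO = re.compile(r"^(?P<unique>.+)(?P<sep>[:;\-])2,(?P<flags>[A-Za-z]*)$")

def split_any_separator(filename):
    """Return (unique, separator, flags); separator is None if no suffix is found."""
    m = _INFO.match(filename)
    if m is None:
        return filename, None, ""
    return m.group("unique"), m.group("sep"), m.group("flags")
```

Because host names may themselves contain '-', the pattern only treats a separator as such when it is directly followed by "2," and flag letters at the end of the name.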

Software that supports Maildir directly

Mail servers

Delivery agents

Mail readers

Mail index and search tools

  • Beagle (software) can index Maildirs and many other information storage formats
  • Mairix is a program for indexing and searching email messages stored in maildir, MH or mbox folders

Software that supports Maildir by implication

The list of software that can be used with Maildir is in fact much larger once one considers how this software can be plugged together, and the role of network access protocols.

For example:

  • The Sendmail MTA does not support any mail delivery format (although many assume that it does). Sendmail uses a separate delivery process called mail.local. Procmail (and other programs that support Maildir) can be used in place of mail.local, so Sendmail can rightly be said to support Maildir as much as it supports any other format.
  • Many mail readers do not support Maildir but do support remote access protocols such as IMAP. Since there are several IMAP mail stores that support Maildir, any mail reader that supports IMAP, such as Microsoft Outlook, Pine, or Mozilla Thunderbird, can be used to access Maildir folders.
  • Fetchmail does not support Maildir (or any local delivery format) but since it talks to an SMTP server or local delivery agent, any of those listed above can be used to deliver mail from Fetchmail to Maildirs.

Notes and references

  1. Daniel J. Bernstein (1995). Using maildir format (the original specification).
  2. Sam Varshavchik (1998). Maildir++ and Maildir quotas (includes the Maildir++ specification).
  3. Dovecot Wiki: maildir format.

See also