Greylisting is a method of defending e-mail users against spam. A mail transfer agent (MTA) using greylisting will "temporarily reject" any email from a sender it does not recognize. If the mail is legitimate the originating server will, after a delay, try again and, if sufficient time has elapsed, the email will be accepted.
How it works
A server employing greylisting deliberately degrades mail service for unknown or suspect sources, over a short period of time. Typically, it records three pieces of data, known as a "triplet", for each incoming mail message:
- The IP address of the connecting host
- The envelope sender address
- The envelope recipient address(es), or just the first of them.
This data is registered in the mail server's internal database, along with the time-stamp of its first appearance. The email message is refused with a temporary error until the configured period of time has elapsed, usually some minutes or a small number of hours. Temporary errors are defined in the Simple Mail Transfer Protocol (SMTP) as 4xx reply codes, and fully capable SMTP implementations are expected to maintain queues for retrying message transmissions in such cases. When a sender has proven itself able to properly retry delivery, it is whitelisted for a longer period of time, so that future delivery attempts are unimpeded.
For example, a greylister can require a successful delivery attempt against a registered triplet to be no earlier than 25 minutes and no later than 4 hours after registration. Delivery attempts repeated before the 25 minutes have passed are refused with the same 4xx reply code. After 4 hours the triplet expires, so later delivery attempts register anew. When the greylister sees an attempt within the window from 25 minutes to 4 hours, the connecting host is whitelisted for 36 days.
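The 25-minute, 4-hour and 36-day example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name `Greylister`, the in-memory dictionaries, the use of plain seconds for time and the choice of 451 as the temporary reply code are all hypothetical and not taken from any particular greylisting package.

```python
from dataclasses import dataclass, field

MIN_DELAY = 25 * 60                 # earliest accepted retry: 25 minutes
MAX_WINDOW = 4 * 60 * 60            # a triplet expires 4 hours after registration
WHITELIST_TTL = 36 * 24 * 60 * 60   # a proven host stays whitelisted for 36 days


@dataclass
class Greylister:
    # (ip, sender, recipient) -> time the triplet was first seen, in seconds
    triplets: dict = field(default_factory=dict)
    # ip -> time at which the whitelist entry expires, in seconds
    whitelist: dict = field(default_factory=dict)

    def check(self, ip, sender, recipient, now):
        """Return 250 to accept the message or 451 to defer it."""
        if self.whitelist.get(ip, 0) > now:
            return 250
        key = (ip, sender, recipient)
        first_seen = self.triplets.get(key)
        if first_seen is None or now - first_seen > MAX_WINDOW:
            # Unknown or expired triplet: register it anew and defer.
            self.triplets[key] = now
            return 451
        if now - first_seen < MIN_DELAY:
            # Retry came too early: defer again, keep the original time-stamp.
            return 451
        # Retry inside the window: accept and whitelist the connecting host.
        del self.triplets[key]
        self.whitelist[ip] = now + WHITELIST_TTL
        return 250
```

A first attempt at time 0 and a retry after 10 minutes are both deferred, a retry after 30 minutes is accepted, and later mail from the same host then passes without delay until the whitelist entry expires.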
The temporary rejection can be issued at different stages of the SMTP dialogue, allowing an implementation to store more or less data about the incoming message. The trade-off is more work and bandwidth for more exact matching of retries with original messages. Rejecting a message after its content has been received allows the server to store a choice of headers and/or a hash of the message body.
In addition to whitelisting good senders, a greylister can provide for exceptions. Greylisting can generally be overridden by a fully validated TLS connection with a matching certificate. Because large senders often have a pool of machines that can send (and resend) email, IP addresses that share the same most-significant 24 bits (the same /24 network) are treated as equivalent, or in some cases SPF records are used to determine the sending pool. Similarly, some e-mail systems use unique per-message return-paths, for example variable envelope return path (VERP) for mailing lists, Sender Rewriting Scheme for forwarded e-mail, and Bounce Address Tag Validation for backscatter protection. If an exact match on the sender address is required, every e-mail from such systems will be delayed. Some greylisting systems try to avoid this delay by discarding the variable parts of the address and matching only on the sender domain and the beginning of the local-part.
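The two normalisations described above, grouping by /24 network and trimming variable return-paths, might look like the following Python sketch. The function names `network_key` and `sender_key` are hypothetical, and cutting the local-part at the first `+` or `=` is an assumed heuristic chosen for illustration; real implementations may use different rules.

```python
import ipaddress
import re


def network_key(ip):
    """Collapse an IPv4 address to its /24 network so a sending pool matches."""
    addr = ipaddress.ip_address(ip)
    if addr.version == 4:
        return str(ipaddress.ip_network(f"{ip}/24", strict=False))
    return str(addr)  # IPv6 grouping is left out of this sketch


def sender_key(envelope_sender):
    """Keep the sender domain and only the beginning of the local-part."""
    local, _, domain = envelope_sender.rpartition("@")
    # Assumed heuristic: the per-message part starts at the first '+' or '='.
    stable = re.split(r"[+=]", local, maxsplit=1)[0]
    return f"{stable}@{domain.lower()}"
```

With these keys, `203.0.113.57` and `203.0.113.200` fall into the same group, and two VERP addresses from one mailing list that differ only in the encoded subscriber map to the same sender key.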
Why it works
Greylisting is effective because many mass email tools used by spammers do not queue and reattempt mail delivery as a regular mail transfer agent does. Queueing and resending requires resources, and spamming normally operates on very narrow margins. Since the advent of greylisting, spammers have taken to rerunning their mail delivery tools to send the same email again, which avoids the cost of queueing but still costs them the resources of a second delivery run. As of 2011 spammers have been using this technique for many years.
Delaying delivery also gives real-time blackhole lists and similar lists time to identify and flag the spam source. Thus, these subsequent attempts are more likely to be detected as spam by other mechanisms than they were before the greylisting delay.
The main advantage from the users' point of view is that greylisting requires no additional configuration from their end. If the server utilizing greylisting is configured appropriately, the end user will only notice a delay on the first message from a given sender, so long as the sending email server is identified as belonging to the same whitelisted group as earlier messages. If mail from the same sender is repeatedly greylisted it may be worth contacting the mail system administrator with detailed headers of delayed mail.
From a mail administrator's point of view the benefit is twofold. First, greylisting takes minimal configuration to get up and running, with only occasional modifications of any local whitelists. Second, rejecting email with a temporary 451 error (the actual error code is implementation dependent) is very cheap in system resources. Most spam filtering tools are very intensive users of CPU and memory. By stopping spam before it hits filtering processes, far fewer system resources are used. This allows more layers of spam filtering or higher throughput, since greylisting can easily be configured as a first line of defense, with a heuristic filter such as SpamAssassin handling the messages that get through.
Greylisting is particularly effective in many cases at weeding out misconfigured MTAs, and is gaining in popularity as a very effective anti-spam tool. It is likely that those MTAs that do not correctly handle greylisting will become less numerous as greylisting spreads.
Some greylisting packages support a SQL backend which allows for a distributed multiple-server frontend to be deployed with the same greylisting data on all frontends.
Delayed delivery and its consequences
The biggest disadvantage of greylisting is that for unrecognized servers, it destroys the near-instantaneous nature of email that users have come to expect. Mail from unrecognized servers is typically delayed by about 15 minutes, and could be delayed up to a few days. A customer of a greylisting ISP cannot always rely on getting every email in a pre-determined amount of time. This disadvantage is mitigated by the fact that near-instantaneous mail delivery is restored once a server has been recognized, and is generally maintained automatically so long as users continue to exchange messages. However, the disadvantage is especially visible when a user of a greylisting mail server attempts to reset their credentials on a website that uses email confirmation of password resets. In extreme cases the delivery delay imposed by the greylister can exceed the expiry time of the password reset token delivered in the email. In these cases manual intervention may be required to whitelist the website's mail server so that the email containing the reset token can be used before it expires.
Sendmail, one of the most prolific internet mail transfer agents (if not the most prolific), has a default retry interval of 15 minutes. Generally this is the maximum amount of time an email will be delayed. Experienced administrators of email systems should tune their mail system settings to sensible values; the biggest delays from greylisting systems are incurred when communicating with poorly configured sending systems whose retry intervals are left set at several hours or more.
The original specification for email states that it is not a guaranteed delivery mechanism and not an instantaneous delivery mechanism. This means that greylisting is a perfectly legitimate process and does not break any protocols or rules. Explaining this to users that have become accustomed to immediate email delivery will probably not convince them that a mail server that uses greylisting is behaving correctly.
Modern greylisting applications (such as Postgrey) automatically whitelist senders that prove themselves capable of recovering from temporary errors. Note that this is irrespective of the reputed spamminess of the sender.
When a mail server is greylisted, the time between the initial rejection and the retransmission is variable. Some mail servers use a default of four hours, though most will retry sooner. Most open-source MTAs have retry rules set to attempt delivery after around fifteen minutes (Sendmail default is 0, 15, ...; Exim default is 0, 15, ...; Postfix default is 0, 16.6, ...; Qmail default is 0, 6:40, 26:40, ...; Courier default is 0, 5, 10, 15, 30, 35, 40, 70, 75, 80, ...; Microsoft Exchange defaults to 0, 1, 2, 22, 42, 62, ...; Message Systems Momentum defaults to 0, 20, 60, 100, 180, ...). The SMTP specification says the retry interval should be at least 30 minutes, while the give-up time needs to be at least 4–5 days.
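How a retry schedule interacts with a greylisting window can be worked out with a short Python sketch. It assumes that the figures listed above are cumulative minutes since the first attempt, and it reuses the 25-minute and 4-hour window from the earlier example; the function name `delivery_delay` is hypothetical. As a simplification, it does not model a triplet being registered anew after it expires.

```python
def delivery_delay(schedule, min_delay=25, window=240):
    """Return the first attempt time (in minutes) that the greylister accepts.

    `schedule` lists attempt times in minutes since the first attempt.
    Returns None if no listed attempt falls inside the acceptance window.
    """
    for minutes in schedule:
        if min_delay <= minutes <= window:
            return minutes
    return None


courier = [0, 5, 10, 15, 30, 35, 40, 70, 75, 80]
exchange = [0, 1, 2, 22, 42, 62]
momentum = [0, 20, 60, 100, 180]

for name, schedule in [("Courier", courier), ("Exchange", exchange), ("Momentum", momentum)]:
    print(name, delivery_delay(schedule))
```

Under these assumptions the same 25-minute greylisting delay turns into an actual delay of 30, 42 or 60 minutes depending on the sender's schedule, because the message only gets through on the first retry that lands after the minimum delay.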
Greylisting delays much of the mail from non-whitelisted mail servers, not just spam, until typical patterns of communication are recorded by the greylisting system. For best results, whitelisting should be used extensively. A static list of public servers worth being whitelisted can be found in the greylisting.org repository, though this is significantly out-of-date.
Greylisting can be a particular nuisance with websites that require an account to be created and the email address confirmed before they can be used. If the sending MTA of the site is poorly configured, greylisting may delay the initial email containing the signup confirmation link, thus introducing a waiting period even though the website itself attempted to send the confirmation email immediately. Almost all stock-configured Sendmail MTAs (Sendmail being the most widely deployed MTA on the internet) will retry after a few minutes, leading to typical delays of under 10 minutes in most cases (still dependent on the greylisting configuration).
On a technical level, some misbehaving SMTP senders may interpret the temporary rejection as a permanent failure. Old clients conforming only to the obsolete specification (RFC 821) and ignoring its recommendations may give up on delivery after the first failed attempt: RFC 821 states that clients "should" retry messages rather than using the word "must". RFC 2119 defines "should" as a recommendation that may be ignored only after the consequences have been weighed, and under the current SMTP standard it is a violation for the client to fail to retry. The current SMTP specification (RFC 5321) clearly states that "the SMTP client retains responsibility for delivery of that message" (section 4.2.5) and "mail that cannot be transmitted immediately MUST be queued and periodically retried by the sender." (section 4.5.4.1).
This problem can affect SMTP clients in unexpected ways. Most MTAs will queue and retry messages, but a small number do not. A similar concern exists for applications which act as SMTP clients and fail to incorporate any form of queueing for deferred SMTP mail. This can be mitigated on the sending side by configuring the application to use a local SMTP server as an outbound queue, instead of attempting direct delivery. For the server operator who uses greylisting, clients which are known to fail on temporary errors can be supported by whitelisting or exception lists.
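The behaviour expected of a sending client can be illustrated with a minimal Python sketch: a 4xx reply means "keep the message queued and try again later", while only a 5xx reply is a permanent failure. The function names `classify` and `deliver_with_queue`, the `send` callback and the list of attempt times are hypothetical and chosen for illustration; a real application would normally hand the message to a local SMTP server rather than implement its own queue.

```python
def classify(reply_code):
    """Map an SMTP reply code to the action a well-behaved client takes."""
    if 200 <= reply_code < 300:
        return "delivered"
    if 400 <= reply_code < 500:
        return "retry"      # temporary failure: keep the message queued
    return "bounce"         # permanent failure: report back to the sender


def deliver_with_queue(send, message, attempt_times):
    """Call send(message, now) at each attempt time until a final outcome.

    `send` is a stand-in for a real SMTP transaction and returns a reply code.
    Returns (outcome, time_of_last_attempt).
    """
    last = None
    for now in attempt_times:
        last = now
        outcome = classify(send(message, now))
        if outcome != "retry":
            return outcome, now
    return "expired", last  # every scheduled attempt was deferred
```

A client that stops after its first attempt corresponds to a schedule with a single entry: against a greylisting server it ends in "expired" and the message is never delivered, which is the failure mode described above.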
Some MTAs, upon encountering the temporary failure message from a greylisting server on the first attempt, will send a warning message back to the original sender of the message. The warning message is not a bounce message, but it is often formatted similarly to one and reads like one. This practice often causes the sender to believe that the message has not been delivered, when in fact the message will be delivered successfully at a later time.
Also, legitimate mail might not get delivered if the retry comes from a different IP address than the original attempt. When the source of an email is a server farm or goes out through some other kind of relay service, it is likely that a server other than the original one will make the next attempt. For network fault tolerance, their IP addresses can belong to completely unrelated address blocks, which defeats the simple technique of matching on the most significant part of the address. Since the IP addresses are different, the recipient's server will fail to recognize that a series of attempts are related, and will refuse each of them in turn. This can continue until the message ages out of the queue if the number of servers is large enough. The problem can be partially bypassed by proactively identifying such server farms as exceptions. Likewise, exceptions have to be configured for multihomed hosts and hosts using DHCP. In the extreme case, a sender could (legitimately) use a different IPv6 address for each outbound SMTP connection.
A sender subjected to greylisting might move to a backup server and reattempt delivery. In order for greylisting to work in such cases, all backup mail servers (as specified by lower-priority MX records for the domain) should implement the same greylisting policy and share the same database. Traffic to those backup servers increases merely as a result of greylisting.
- Murray Kucherawy; Dave Crocker (June 2012). Email Greylisting: An Applicability Statement for SMTP. IETF. RFC 6647. https://tools.ietf.org/html/rfc6647. Retrieved 1 November 2012.
- John Klensin (October 2008). Simple Mail Transfer Protocol. IETF. RFC 5321. https://tools.ietf.org/html/rfc5321. Retrieved 1 November 2012.
- John Levine (2005). "Experiences with Greylisting". Second Conference on Email and Anti-Spam.
- David Schweikert. "Postgrey - Postfix Greylisting Policy Server". Retrieved 1 November 2012. "Clients which repeatedly show to be able to pass the greylist, are entered in a "clients whitelist", for which no greylisting is done anymore."
- "Filtering Spam: Combined techniques give best results". Shamrock Software GmbH. December 2007. Retrieved 2008-01-09.
- "WebSVN greylisting log". Retrieved 22 November 2012. "Age=2455 days old"
- Evan Harris (21 August 2003). "The Next Step in the Spam Control War: Greylisting". PureMagic Software. Retrieved 2008-01-09.
- Greylisting.org: Repository of greylist info
- A greylisting whitepaper by Evan Harris
- A greylisting implementation for netqmail
- Microsoft Exchange Greylisting Problems - Newsgroup Article
- RFC 6647 of the Internet Engineering Task Force, June 2012: Standardizes the current state of the art