DomainKeys Identified Mail
DomainKeys Identified Mail (DKIM) is an email authentication method designed to detect email spoofing. It allows the receiver to check that an email claimed to have come from a specific domain was indeed authorized by the owner of that domain. It is intended to prevent forged sender addresses in emails, a technique often used in phishing and email spam.
In technical terms, DKIM lets a domain associate its name with an email message by affixing a digital signature to it. Verification is carried out using the signer's public key published in the DNS. A valid signature guarantees that some parts of the email (possibly including attachments) have not been modified since the signature was affixed. Usually, DKIM signatures are not visible to end-users, and are affixed or verified by the infrastructure rather than by the message's authors and recipients. In that respect, DKIM differs from end-to-end digital signatures.
DKIM resulted in 2004 from merging two similar efforts, "enhanced DomainKeys" from Yahoo and "Identified Internet Mail" from Cisco. This merged specification has been the basis for a series of IETF standards-track specifications and support documents which eventually resulted in STD 76, currently RFC 6376. "Identified Internet Mail" was proposed by Cisco as a signature-based mail authentication standard, while DomainKeys was designed by Yahoo to verify the DNS domain of an e-mail sender and the message integrity.
Aspects of DomainKeys, along with parts of Identified Internet Mail, were combined to create DomainKeys Identified Mail (DKIM). Trendsetting providers implementing DKIM include Yahoo, Gmail, AOL and FastMail. Any mail from these organizations should carry a DKIM signature.
DKIM became one of the two pillars on which DMARC was founded, in the early 2010s. Discussions about DKIM signatures passing through indirect mail flows, formally in the DMARC working group, took place right after the first adoptions of the new protocol wreaked havoc on regular mailing list use. However, none of the proposed DKIM changes passed. Instead, mailing list software was changed.
In 2017, another working group was launched, DKIM Crypto Update (dcrup), with the specific restriction to review signing techniques. RFC 8301 was issued in January 2018. It bans SHA-1 and updates key sizes (from 512-2048 to 1024-4096). RFC 8463 was issued in September 2018. It adds an elliptic curve algorithm to the existing RSA. The added key type, k=ed25519, is adequately strong while featuring short public keys, which are more easily published in DNS.
The original DomainKeys was designed by Mark Delany of Yahoo! and enhanced through comments from many others since 2004. It is specified in Historic RFC 4870, superseded by Standards Track RFC 4871, DomainKeys Identified Mail (DKIM) Signatures; both published in May 2007. A number of clarifications and conceptualizations were collected thereafter and specified in RFC 5672, August 2009, in the form of corrections to the existing specification. In September 2011, RFC 6376 merged and updated the latter two documents, while preserving the substance of the DKIM protocol. Public key compatibility with the earlier DomainKeys is also possible.
DKIM was initially produced by an informal industry consortium and was then submitted for enhancement and standardization by the IETF DKIM Working Group, chaired by Barry Leiba and Stephen Farrell, with Eric Allman of sendmail, Jon Callas of PGP Corporation, Mark Delany and Miles Libbey of Yahoo!, and Jim Fenton and Michael Thomas of Cisco Systems attributed as primary authors.
DKIM provides for two distinct operations, signing and verifying. Either of them can be handled by a module of a mail transfer agent (MTA). The signing organization can be a direct handler of the message, such as the author, the submission site or a further intermediary along the transit path, or an indirect handler such as an independent service that is providing assistance to a direct handler. Signing modules insert one or more DKIM-Signature: header fields, possibly on behalf of the author organization or the originating service provider. Verifying modules typically act on behalf of the receiver organization, possibly at each hop.
The need for this type of validated identification arose because spam often has forged addresses and content. For example, a spam message may claim to be from firstname.lastname@example.org, although it is not actually from that address or domain or entity, and the spammer's goal is to convince the recipient to accept and to read the email. It is difficult for recipients to establish whether to trust or distrust any particular message or even domain, and system administrators may have to deal with complaints about spam that appears to have originated from their systems but did not. The DKIM specification allows signers to choose which header fields they sign, but the From: field must always be signed.
DKIM allows the signer (author organization) to communicate which emails it considers legitimate. It does not directly prevent or disclose abusive behavior. This ability to distinguish legitimate mail from potentially forged mail has benefits for recipients of e-mail as well as senders.
DKIM is independent of Simple Mail Transfer Protocol (SMTP) routing aspects in that it operates on the RFC 5322 message—the transported mail's header and body—not the SMTP envelope defined in RFC 5321. Hence the DKIM signature survives basic relaying across multiple MTAs.
The DKIM-Signature header field consists of a list of tag=value parts. Tags are short, usually only one or two letters. The most relevant ones are b for the actual digital signature of the contents (headers and body) of the mail message, bh for the body hash, d for the signing domain, and s for the selector. The default parameters for the authentication mechanism are to use SHA-256 as the cryptographic hash and RSA as the public key encryption scheme, and to encode the encrypted hash using Base64.
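The tag=value structure described above can be illustrated with a minimal Python sketch. The helper name parse_tag_list and the shortened sample values are assumptions made for illustration; a production parser would follow the full grammar in RFC 6376.

```python
import re
from typing import Dict


def parse_tag_list(value: str) -> Dict[str, str]:
    """Split a DKIM tag=value list into a dict (hypothetical helper)."""
    tags: Dict[str, str] = {}
    for part in value.split(";"):
        part = part.strip()
        if not part:
            continue  # tolerate a trailing semicolon
        # Split on the first "=" only: Base64 values may end in "=" padding.
        name, sep, val = part.partition("=")
        if not sep:
            raise ValueError("malformed tag: %r" % part)
        name = name.strip()
        if name in tags:
            raise ValueError("duplicate tag: %s" % name)
        tags[name] = val.strip()
    return tags


# Shortened, made-up sample value (not a verifiable signature).
header_value = (
    "v=1; a=rsa-sha256; d=example.net; s=brisbane; "
    "c=relaxed/simple; h=from:to:subject; bh=MTIz; b=dzdV yOfA"
)
tags = parse_tag_list(header_value)

# Folding whitespace inside the Base64 signature is not significant.
signature_b64 = re.sub(r"\s+", "", tags["b"])
signed_fields = tags["h"].split(":")

print(tags["d"], tags["s"], signed_fields, signature_b64)
```

Splitting on the first "=" only is a deliberate choice here, since the bh and b values are Base64 and may themselves contain "=" characters.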
Both header and body contribute to the signature. First, the message body is hashed, always from the beginning, possibly truncated at a given length (which may be zero). Second, selected header fields are hashed, in the order given by h. Repeated field names are matched from the bottom of the header upward, which is the order in which Received: fields are inserted in the header. A non-existing field matches the empty string, so that adding a field with that name will break the signature. The DKIM-Signature: field of the signature being created, with bh equal to the computed body hash and b equal to the empty string, is implicitly added to the second hash, although its name must not appear in h; if it does, it refers to another, preexisting signature. For both hashes, text is canonicalized according to the relevant c algorithms. The result, after encryption with the private key and encoding using Base64, is b. Algorithms, fields, and body length are meant to be chosen so as to assure unambiguous message identification while still allowing signatures to survive the unavoidable changes which are going to occur in transit. No end-to-end data integrity is implied.
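The bottom-up matching of repeated header fields is easy to get wrong, so here is a minimal Python sketch of the selection step only. The function name select_signed_headers and the sample header are assumptions for illustration; canonicalization and the actual hashing are left out.

```python
from typing import Dict, List, Tuple

Header = Tuple[str, str]


def select_signed_headers(headers: List[Header], h_tag: str) -> List[Header]:
    """Pick header fields in the order listed in h, matching repeats bottom-up."""
    # Group the instances of each field by lowercase name, top to bottom.
    remaining: Dict[str, List[Header]] = {}
    for name, value in headers:
        remaining.setdefault(name.lower(), []).append((name, value))

    selected: List[Header] = []
    for wanted in h_tag.split(":"):
        instances = remaining.get(wanted.strip().lower(), [])
        if instances:
            # pop() takes the bottom-most instance not yet consumed.
            selected.append(instances.pop())
        # A field that does not exist contributes nothing (the empty string).
    return selected


# Made-up header, listed top to bottom as it would appear in a message.
headers = [
    ("Received", "from b.example by c.example"),
    ("Received", "from a.example by b.example"),
    ("From", "alice@example.net"),
    ("Subject", "hello"),
]
picked = select_signed_headers(headers, "from:received:received:keywords")
for name, value in picked:
    print(name + ": " + value)
```

In this sample, listing keywords in h although the message has no such field means that adding a Keywords: field later would change the hashed input and break the signature.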
The receiving SMTP server uses the domain name and the selector to perform a DNS lookup. For example, given the signature
DKIM-Signature: v=1; a=rsa-sha256; d=example.net; s=brisbane; c=relaxed/simple; q=dns/txt; l=1234; t=1117574938; x=1118006938; h=from:to:subject:date:keywords:keywords; bh=MTIzNDU2Nzg5MDEyMzQ1Njc4OTAxMjM0NTY3ODkwMTI=; b=dzdVyOfAKCdLXdJOc9G2q8LoXSlEniSbav+yuU4zGeeruD00lszZ VoG4ZHRNiYzR
A verifier queries the TXT resource record type of brisbane._domainkey.example.net. Here, example.net is the author domain to be verified against (in the d field), brisbane is a selector given in the s field, while _domainkey is a fixed part of the protocol. There are no CAs or revocation lists involved in DKIM key management, and the selector is a straightforward method to allow signers to add and remove keys whenever they wish; long-lasting signatures for archival purposes are outside DKIM's scope. Some more tags are visible in the example:
- v is the version,
- a is the signing algorithm,
- d is the domain,
- s is the selector,
- c is the canonicalization algorithm(s) for header and body,
- q is the default query method,
- l is the length of the canonicalized part of the body that has been signed,
- t is the signature timestamp,
- x is its expire time, and
- h is the list of signed header fields, repeated for fields that occur multiple times.
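As a small illustration of how the d, s and q tags combine into a DNS query, the following Python sketch builds the name a verifier would look up. The function name dkim_query_name is hypothetical, and the lookup itself is not shown because it needs a DNS resolver library.

```python
from typing import Dict


def dkim_query_name(tags: Dict[str, str]) -> str:
    """Build the DNS name whose TXT record holds the signer's public key."""
    # q is optional; dns/txt is the default query method.
    methods = tags.get("q", "dns/txt").split(":")
    if "dns/txt" not in [m.strip() for m in methods]:
        raise ValueError("no supported query method in q tag")
    # <selector>._domainkey.<domain>; "_domainkey" is fixed by the protocol.
    return "%s._domainkey.%s" % (tags["s"], tags["d"])


# Tags taken from the example signature above.
example_tags = {"d": "example.net", "s": "brisbane", "q": "dns/txt"}
query_name = dkim_query_name(example_tags)
print(query_name)  # brisbane._domainkey.example.net
```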
The data returned from the query is also a list of tag-value pairs. It includes the domain's public key, along with other key usage tokens and flags. The receiver can then use this key to decrypt the hash value in the header field and, at the same time, recalculate the hash value for the mail message (headers and body) that was received. If the two values match, this cryptographically proves that the mail was signed by the indicated domain and has not been tampered with in transit.
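A minimal Python sketch of reading such a key record follows. The function name parse_key_record and the returned field names are assumptions, and the p value is a placeholder rather than a real key. The signature check itself is omitted because it needs a cryptography library.

```python
import base64
from typing import Any, Dict


def parse_key_record(txt: str) -> Dict[str, Any]:
    """Extract the key type and public key bytes from a DKIM TXT record."""
    tags: Dict[str, str] = {}
    for part in txt.split(";"):
        name, sep, val = part.strip().partition("=")
        if sep:
            # Whitespace inside a value (e.g. a folded Base64 key) is dropped.
            tags[name.strip()] = "".join(val.split())
    if tags.get("v", "DKIM1") != "DKIM1":
        raise ValueError("unsupported key record version")
    if "p" not in tags:
        raise ValueError("key record has no p tag")
    return {
        "key_type": tags.get("k", "rsa"),   # rsa is the default key type
        "revoked": tags["p"] == "",          # an empty p marks a revoked key
        "public_key": base64.b64decode(tags["p"]),
    }


# Placeholder key material, NOT a usable public key.
placeholder = base64.b64encode(b"not a real key").decode("ascii")
record = "v=DKIM1; k=rsa; p=" + placeholder
info = parse_key_record(record)
print(info["key_type"], info["revoked"], len(info["public_key"]))
```

Treating an empty p value as a revoked key matches the way selectors are used to retire keys, as described above.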
Signature verification failure does not force rejection of the message. Instead, the precise reasons why the authenticity of the message could not be proven should be made available to downstream and upstream processes. Methods for doing so may include sending back an FBL message, or adding an Authentication-Results header field to the message as described in RFC 7001.
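As a rough sketch of how a verifier might record its outcome, the following Python function formats a simplified Authentication-Results field. The function name and the host mx.example.org are hypothetical, and only a simplified subset of the syntax is produced.

```python
# DKIM result keywords used in Authentication-Results.
DKIM_RESULTS = {
    "none", "pass", "fail", "policy", "neutral", "temperror", "permerror",
}


def authentication_results(authserv_id: str, result: str, signing_domain: str) -> str:
    """Format a one-line Authentication-Results field for a DKIM check."""
    if result not in DKIM_RESULTS:
        raise ValueError("unknown DKIM result: %r" % result)
    return "Authentication-Results: %s; dkim=%s header.d=%s" % (
        authserv_id, result, signing_domain)


# A failed check is reported to downstream processing, not used to reject.
line = authentication_results("mx.example.org", "fail", "example.net")
print(line)
```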
DomainKeys is covered by U.S. Patent 6,986,049 assigned to Yahoo! Inc. For the purpose of the DKIM IETF Working Group, Yahoo! released the now obsolete DK library under a dual license scheme: the DomainKeys Patent License Agreement v1.2, an unsigned version of which can still be found, and GNU General Public License v2.0 (and no other version).
The primary advantage of this system for e-mail recipients is in allowing the signing domain to reliably identify a stream of legitimate email, thereby allowing domain-based blacklists and whitelists to be more effective. This is also likely to make certain kinds of phishing attacks easier to detect.
There are some incentives for mail senders to sign outgoing e-mail:
- It allows a great reduction in abuse desk work for DKIM-enabled domains if e-mail receivers use the DKIM system to identify forged e-mail messages claiming to be from that domain.
- The domain owner can then focus its abuse team energies on its own users who actually are making inappropriate use of that domain.
Use with spam filtering
DKIM is a method of labeling a message, and it does not itself filter or identify spam. However, widespread use of DKIM can prevent spammers from forging the source address of their messages, a technique they commonly employ today. If spammers are forced to show a correct source domain, other filtering techniques can work more effectively. In particular, the source domain can feed into a reputation system to better identify spam. Conversely, DKIM can make it easier to identify mail that is known not to be spam and need not be filtered. If a receiving system has a whitelist of known good sending domains, either locally maintained or from third party certifiers, it can skip the filtering on signed mail from those domains, and perhaps filter the remaining mail more aggressively.
DKIM can be useful as an anti-phishing technology. Mailers in heavily phished domains can sign their mail to show that it is genuine. Recipients can take the absence of a valid signature on mail from those domains to be an indication that the mail is probably forged. The best way to determine the set of domains that merit this degree of scrutiny remains an open question. DKIM used to have an optional feature called ADSP that lets authors that sign all their mail self-identify, but it was demoted to historic status in November 2013. Instead, DMARC can be used for the same purpose and allows domains to self-publish which techniques (including SPF and DKIM) they employ, which makes it easier for the receiver to make an informed decision whether a certain mail is spam or not. Using DMARC, Gmail for example rejects all emails from eBay and PayPal that are not authenticated.
Because it is implemented using DNS records and an added RFC 5322 header field, DKIM is compatible with the existing e-mail infrastructure. In particular, it is transparent to existing e-mail systems that lack DKIM support.
DKIM requires cryptographic checksums to be generated for each message sent through a mail server, which results in computational overhead not otherwise required for e-mail delivery. This additional computational overhead is a hallmark of digital postmarks, making sending bulk spam more (computationally) expensive. This facet of DKIM may look similar to hashcash, except that the receiver side verification is not a negligible amount of work, and a typical hashcash algorithm would require far more work.
DKIM's non-repudiation feature prevents senders (such as spammers) from credibly denying having sent an email. It has proven useful to news media sources such as WikiLeaks, which has been able to leverage DKIM body signatures to prove that leaked emails were genuine and not tampered with, definitively repudiating claims by Hillary Clinton's 2016 US Presidential Election running mate Tim Kaine, and DNC Chair Donna Brazile.
The RFC itself identifies a number of potential attack vectors.
DKIM signatures do not encompass the message envelope, which holds the return-path and message recipients. Since DKIM does not attempt to protect against mis-addressing, this does not affect its utility.
A concern for any cryptographic solution would be message replay abuse, which bypasses techniques that currently limit the level of abuse from larger domains. Replay can be inferred by using per-message public keys, tracking the DNS queries for those keys and filtering out the high number of queries due to e-mail being sent to large mailing lists or malicious queries by bad actors.
For a comparison of different methods also addressing this problem see e-mail authentication.
As mentioned above, authentication is not the same as abuse prevention. A malicious email user of a reputable domain can compose a bad message and have it DKIM-signed and sent from that domain to any mailbox from where they can retrieve it as a file, so as to obtain a signed copy of the message. Use of the l tag in signatures makes doctoring such messages even easier. The signed copy can then be forwarded to a million recipients, for example through a botnet, without control. The email provider who signed the message can block the offending user, but cannot stop the diffusion of already-signed messages. The validity of signatures in such messages can be limited by always including an expiration time tag in signatures, or by revoking a public key periodically or upon a notification of an incident. Effectiveness of the scenario can hardly be limited by filtering outgoing mail, as that implies the ability to detect if a message might potentially be useful to spammers.
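The expiration tag mentioned here can be checked with very little code. The following Python sketch uses the t and x values from the example signature earlier in the article; the function name is_expired is an assumption.

```python
from typing import Dict


def is_expired(tags: Dict[str, str], now: int) -> bool:
    """Return True if the signature carries an x tag that lies in the past."""
    expires = tags.get("x")
    if expires is None:
        return False  # no expiration tag: the signature never expires on its own
    return now > int(expires)


# t and x values from the example signature (seconds since the Unix epoch).
tags = {"t": "1117574938", "x": "1118006938"}
lifetime_days = (int(tags["x"]) - int(tags["t"])) / 86400
print(lifetime_days)  # 5.0

print(is_expired(tags, now=1117574938))  # False: checked at signing time
print(is_expired(tags, now=1118006939))  # True: one second past expiry
```

A short lifetime narrows the window in which a replayed copy still verifies, at the cost of failing verification for mail that is legitimately delayed longer than that.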
DKIM currently features two canonicalization algorithms, simple and relaxed, neither of which is MIME-aware. Mail servers can legitimately convert to a different character set, and often document this with X-MIME-Autoconverted header fields. In addition, servers in certain circumstances have to rewrite the MIME structure, thereby altering the preamble, the epilogue, and entity boundaries, any of which breaks DKIM signatures. Only plain text messages written in us-ascii, provided that MIME header fields are not signed, enjoy the robustness that end-to-end integrity requires.
The OpenDKIM Project organized a data collection involving 21 mail servers and millions of messages. 92.3% of observed signatures were successfully verified, a success rate that drops slightly (90.5%) when only mailing list traffic is considered.
Annotations by mailing lists
The problems might be exacerbated when filtering or relaying software makes changes to a message. Without specific precautions taken by the sender, the footer added by most mailing lists and many central antivirus solutions will break the DKIM signature. A possible mitigation is to sign only a designated number of bytes of the message body, indicated by the l tag in the DKIM-Signature header field. Anything added beyond the specified length of the message body is not taken into account when calculating the DKIM signature. This won't work for MIME messages.
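The effect of the l tag on an appended footer can be shown with a minimal Python sketch of the body hash. It assumes the simple body canonicalization and SHA-256; the function names and the sample message are made up for illustration.

```python
import base64
import hashlib
from typing import Optional


def simple_body(body: bytes) -> bytes:
    """'simple' body canonicalization: drop trailing empty lines, end with CRLF."""
    while body.endswith(b"\r\n"):
        body = body[:-2]
    return body + b"\r\n"


def body_hash(body: bytes, length: Optional[int] = None) -> str:
    """Base64 SHA-256 of the canonicalized body, truncated to `length` bytes."""
    canon = simple_body(body)
    if length is not None:
        canon = canon[:length]  # corresponds to the l tag
    return base64.b64encode(hashlib.sha256(canon).digest()).decode("ascii")


original = b"Hello, world.\r\n"
with_footer = original + b"-- \r\nFooter added by a mailing list\r\n"
signed_length = len(simple_body(original))

# With l set to the original length, the appended footer is ignored.
same_with_l = body_hash(original, signed_length) == body_hash(with_footer, signed_length)
# Without l, the footer changes the body hash and the signature breaks.
same_without_l = body_hash(original) == body_hash(with_footer)
print(same_with_l, same_without_l)  # True False
```

The same property that lets the footer through also lets anyone append arbitrary content after the signed prefix, which is the weakness noted elsewhere in this article.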
Another workaround is to whitelist known forwarders, for example by SPF. For yet another workaround, it was proposed that forwarders verify the signature, modify the email, and then re-sign the message with a Sender: header. However, this solution carries a risk for forwarded third-party signed messages received at SMTP receivers supporting the RFC 5617 ADSP protocol. Thus, in practice, the receiving server still has to whitelist known message streams.
Short key vulnerability
In October 2012, Wired reported that mathematician Zach Harris detected and demonstrated an email source spoofing vulnerability with short DKIM keys for the google.com corporate domain, as well as several other high-profile domains. He stated that 384-bit keys can be factored in as little as 24 hours "on my laptop," and 512-bit keys in about 72 hours with cloud computing resources. Harris found that many organizations sign email with such short keys; he factored them all and notified the organizations of the vulnerability. He states that 768-bit keys could be factored with access to very large amounts of computing power, so he suggests that DKIM signing should use key lengths greater than 1,024 bits. Wired stated that Harris reported, and Google confirmed, that they began using new longer keys soon after his disclosure. According to RFC 6376, the receiving party must be able to validate signatures with keys ranging from 512 bits to 2048 bits, so keys shorter than 512 bits might be incompatible and should be avoided. RFC 6376 also states that signers must use keys of at least 1024 bits for long-lived keys, though it does not define how long "long-lived" is.
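A verifier or an auditing script could apply the updated key sizes from RFC 8301 (1024 to 4096 bits, as noted earlier) with a check along these lines. This is only a sketch: the function name and the category labels are hypothetical, and the 2048-bit threshold reflects that RFC's recommendation for signers.

```python
def classify_rsa_key(bits: int) -> str:
    """Classify an RSA key length against the RFC 8301 key-size range."""
    if bits < 1024:
        return "reject"        # too short to be treated as a valid signature
    if bits < 2048:
        return "acceptable"    # within range, but below the recommended size
    if bits <= 4096:
        return "recommended"   # within the range verifiers must support
    return "optional"          # larger keys may not be supported by verifiers


for size in (384, 512, 768, 1024, 2048, 4096, 8192):
    print(size, classify_rsa_key(size))
```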
- Authenticated Received Chain (ARC)
- Author Domain Signing Practices
- Context filtering
- Domain-based Message Authentication, Reporting and Conformance (DMARC)
- E-mail authentication
- Sender Policy Framework
- Vouch by Reference
- RFC 4686 Analysis of Threats Motivating DomainKeys Identified Mail (DKIM)
- RFC 4871 DomainKeys Identified Mail (DKIM) Signatures Proposed Standard
- RFC 5617 DomainKeys Identified Mail (DKIM) Author Domain Signing Practices (ADSP)
- RFC 5585 DomainKeys Identified Mail (DKIM) Service Overview
- RFC 5672 RFC 4871 DomainKeys Identified Mail (DKIM) Signatures—Update
- RFC 5863 DKIM Development, Deployment, and Operations
- RFC 6376 DomainKeys Identified Mail (DKIM) Signatures Internet Standard
- RFC 6377 DomainKeys Identified Mail (DKIM) and Mailing Lists
- RFC 8301 Cryptographic Algorithm and Key Usage Update to DomainKeys Identified Mail (DKIM)