International email

From Wikipedia, the free encyclopedia

International email (also IDN email or Intl email) is email whose headers contain international characters, that is, UTF-8 encoded characters that do not exist in the ASCII character set. Its most significant feature is that email addresses (also known as email identities) may be written in any language, at both the interface and transport levels.

International email address

Traditional email addresses are limited to characters from the ASCII character set.[1] Non-ASCII Unicode characters therefore cannot appear in a traditional email address. This confines every email address in the world to letters of the English alphabet, digits, and a few special characters from the set { !, #, $, %, &, ', *, +, -, /, =, ?, ^, _, `, ., {, |, }, ~ }, which is rather awkward for people who do not speak English.

The following are valid traditional email addresses:

  Abc@example.com                                (English, ASCII)
  Abc.123@example.com                            (English, ASCII)
  user+mailbox/department=shipping@example.com   (English, ASCII)
  !#$%&'*+-/=?^_`.{|}~@example.com               (English, ASCII)
  "Abc@def"@example.com                          (English, ASCII)
  "Fred Bloggs"@example.com                      (English, ASCII)
  "Joe.\\Blow"@example.com                       (English, ASCII)

International email, however, uses Unicode UTF-8 to encode the text in addresses and headers, so email can be composed for and transported to addresses in any language.[2] The following are all valid international email addresses:

   伊昭傑@郵件.商務                                 (Chinese, Unicode)
   राम@मोहन.ईन्फो                                   (Hindi, Unicode)
   юзер@екзампл.ком                              (Ukrainian, Unicode)
   θσερ@εχαμπλε.ψομ                              (Greek, Unicode)
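
The distinction between the two lists above can be checked mechanically. The following is a minimal sketch, assuming that "international" simply means "contains at least one non-ASCII character"; the function name `needs_international_support` is hypothetical, and the check does not validate the address syntax itself.

```python
def needs_international_support(address: str) -> bool:
    """Return True if the address contains any character outside ASCII."""
    return not address.isascii()


examples = [
    "Abc.123@example.com",   # traditional, ASCII only
    "юзер@екзампл.ком",      # Cyrillic local part and domain
    "伊昭傑@郵件.商務",        # Chinese local part and domain
]

for addr in examples:
    kind = "international" if needs_international_support(addr) else "traditional"
    print(f"{addr}: {kind}")
```

An address for which this returns True can only be handled by software that supports international email.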

Traditional email addresses and identity

Imagine a native Russian speaker who doesn't know any English. The Russian language is written in Cyrillic script:

   а б в г д е ё ж з и й к л м н о п р с т у ф х ц ч ш щ ъ ы ь э ю я

Using an identifier composed of characters from this alphabet, or script, would be far more natural to the native Russian speaker than using characters from the English alphabet. A Russian might wish to use дерек@екзампил.ком as their identifier. However, since traditional email identifiers are confined to the characters of the English alphabet, the Russian is forced to use an identifier in a form that is awkward for them: either a Roman transcription of their native Russian identifier, such as derek@example.com, or some other completely unrelated Roman identifier.

As a result, either email users must identify themselves in a script that may not be native to them (since only Roman script characters are traditionally allowed), or programmers of email systems must compensate at the user interface layer by converting identifiers from non-English scripts to English scripts and back again, using sophisticated and unconventional conversion processes.
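
One such conversion that is standardized is the ASCII form of an internationalized domain name. The sketch below uses Python's built-in "idna" codec, which implements the older IDNA 2003 rules and is used here purely for illustration; the helper names are hypothetical. It converts only the domain part of an address, since there is no comparable standard ASCII form for the local part (the text before the "@").

```python
def split_address(address: str) -> tuple[str, str]:
    """Split an address into local part and domain at the last '@'."""
    local, _, domain = address.rpartition("@")
    return local, domain


def domain_to_ascii(domain: str) -> str:
    """Convert a Unicode domain name to its ASCII ('xn--') form."""
    return domain.encode("idna").decode("ascii")


def domain_to_unicode(ascii_domain: str) -> str:
    """Convert an ASCII ('xn--') domain name back to Unicode."""
    return ascii_domain.encode("ascii").decode("idna")


local, domain = split_address("дерек@екзампил.ком")
ascii_domain = domain_to_ascii(domain)

print(ascii_domain)                     # ASCII-only, each label starts with "xn--"
print(domain_to_unicode(ascii_domain))  # back to the original Cyrillic form
```

The local part дерек is left untouched here, which is exactly the gap that international email addresses are meant to close.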

UTF-8 headers

The traditional format for email headers does allow non-ASCII characters in the value portion of a header by means of the MIME encoded word, but this requires extra processing to convert the data to and from its encoded-word representation. Including international characters in these fields directly as UTF-8 eliminates this extra processing, and also the need to transmit additional charset information, since UTF-8 encoding is assumed implicitly.
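
To make the extra processing concrete, the following sketch uses Python's standard `email.header` module to turn a non-ASCII header value into a MIME encoded word and back. The subject text is an arbitrary example chosen for illustration.

```python
from email.header import Header, decode_header

subject = "Привет, мир"

# Traditional approach: wrap the value in a MIME encoded word so that
# only ASCII characters appear in the header.
encoded = Header(subject, "utf-8").encode()
print(encoded)  # ASCII-only text of the form =?utf-8?...?=

# The receiver has to undo the encoding, using the charset named
# inside the encoded word.
decoded = "".join(
    part.decode(charset or "ascii") if isinstance(part, bytes) else part
    for part, charset in decode_header(encoded)
)
print(decoded)

# International email approach: the value travels as plain UTF-8 bytes,
# with no wrapper and no explicit charset label.
raw_utf8 = subject.encode("utf-8")
print(len(encoded), len(raw_utf8))
```

The encoded word carries both the charset name and a transfer encoding, so it is longer than the raw UTF-8 bytes and must be decoded before display.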

Interoperability via downgrading

Since traditional email standards constrain all email header values to ASCII characters, the presence of UTF-8 characters in email headers can make transporting such email less stable and reliable, because some email servers do not support these characters. As of 2014 this is less and less the case, as support for internationalized domain names (IDN) and UTF-8 characters spreads. Members of the IETF have proposed a method by which email can be downgraded into the "legacy" all-ASCII format that all standard email servers should support. This downgrade mechanism fulfills the requirement that email transport be as robust and reliable as possible while remaining backward compatible.
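
The decision a sending system faces can be sketched as follows. This is an illustration only and not the IETF downgrade procedure itself: the function `plan_delivery` and its return values are hypothetical, and the sketch assumes that a server announces its support through the SMTPUTF8 extension keyword.

```python
def plan_delivery(addresses: list[str], server_extensions: set[str]) -> str:
    """Decide how a message could be handed to the next server.

    Returns "legacy" when every address is plain ASCII, "smtputf8" when
    international support is needed and the server offers it, and
    "needs-downgrade" when it is needed but not offered.
    """
    needs_utf8 = any(not addr.isascii() for addr in addresses)
    if not needs_utf8:
        return "legacy"
    offered = {ext.upper() for ext in server_extensions}
    if "SMTPUTF8" in offered:
        return "smtputf8"
    return "needs-downgrade"


print(plan_delivery(["derek@example.com"], {"8BITMIME"}))
print(plan_delivery(["дерек@екзампил.ком"], {"8BITMIME", "SMTPUTF8"}))
print(plan_delivery(["дерек@екзампил.ком"], {"8BITMIME"}))
```

Only the last case, an international address meeting a server without support, is where a downgrade to the all-ASCII format (or a refusal to send) comes into play.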

Origin

The Email Address Internationalization (EAI) working group of the Internet Engineering Task Force (IETF) developed the specifications for international email as a series of Internet Drafts. These specify changes to the format of email messages and to the protocols used to transport them, and so affect the way email messages are actually delivered from sender to recipient. The core specifications were subsequently published as Internet RFCs (RFC 6530 through RFC 6533, in 2012).

Protocol extensions

SMTP

The Simple Mail Transfer Protocol (SMTP) is the internet standard for transmitting electronic mail, commonly called email. See the full article at SMTP.

POP

The Post Office Protocol (POP) was the early standard for retrieving internet mail (email) from a server. See the full article at Post Office Protocol.

IMAP

The Internet Message Access Protocol (IMAP) is the more commonly used protocol for accessing internet mail held on a server. See the full article at Internet Message Access Protocol.

References

  1. RFC 2822: Internet Message Format
  2. RFC 4952: Overview and Framework for Internationalized Email
