|Type of site||Statistical machine translation|
|Available in||90+ languages, see supported languages|
|Users||Over 200 million people daily|
|Launched||2006 (as rule-based machine translation); October 2007 (as statistical machine translation)|
Google Translate is a free multilingual statistical machine translation service provided by Google to translate text, speech, images, or real-time video from one language into another. It offers a web interface, mobile interfaces for Android and iOS, and an API that developers can use to build browser extensions, applications and other software. As of September 2015, Google Translate supports 90 languages at various levels and serves over 200 million people daily.
- 1 Features
- 2 Translation methodology
- 3 Limitations
- 4 Supported languages
- 5 Languages not yet supported by Google Translate
- 6 Open-source licenses and components
- 7 Reviews
- 8 Translate Community
- 9 See also
- 10 References
- 11 External links
For some languages, Google Translate can pronounce translated text, highlight corresponding words and phrases in the source and target text, and act as a simple dictionary for single-word input. If "Detect language" is selected, text in an unknown language can be identified.
In the web interface, users can suggest alternate translations, such as for technical terms, or correct mistakes. These suggestions are included in future updates to the translation process. If a user enters a URL in the source text, Google Translate will produce a hyperlink to a machine translation of the website. For some languages, text can be entered via an on-screen keyboard, handwriting recognition, or speech recognition. It is also possible to enter a search in a source language; the query is first translated into a destination language, allowing one to browse and interpret the results from that destination language in the source language.
Google Translate is available in some browsers as an extension.
The application supports 90 languages and can translate 37 languages via photo, 32 via voice in conversation mode, and 27 via real-time video in augmented reality mode.
An early 2011 version supported Conversation Mode when translating between English and Spanish (in alpha testing). This interface within Google Translate allows users to communicate fluidly with a nearby person in another language. In October 2011 it was expanded to 14 languages.
The 'Camera input' functionality allows users to take a photograph of a document, signboard, etc. Google Translate recognises the text from the image using optical character recognition (OCR) technology and gives the translation. Camera input is not available for all languages.
In January 2015, the application gained the ability to translate text in real time using the device's camera, as a result of Google's acquisition of the Word Lens app. The speed and quality of the real-time video translation (augmented reality) feature were further enhanced in July 2015 with the release of a new implementation that uses convolutional neural networks.
Google Translate is available as a free downloadable application for Android users. The first version was launched in January 2010. It works much like the browser version. Google Translate for Android contains two main options: "SMS translation" and "History".
The application supports 90 languages and voice input for 15 languages. It is available for devices running Android 2.1 and above and can be downloaded by searching for "Google Translate" in Google Play. It was first released in January 2010, with an improved version available on January 12, 2011.
Latest version: 2.0.0 build 42.
In August 2008, Google launched a Google Translate HTML5 web application for iPhone and iPod Touch users, and the iOS app was released on February 8, 2011. The current Google Translate app is compatible with iPhone, iPad, and iPod Touch devices running iOS 7.0 or later. It accepts voice input for 15 languages and allows translation of a word or phrase into one of more than 50 languages. Translations can be spoken out loud in 23 different languages.
Latest version: 4.0.0 (July 29, 2015)
On May 26, 2011, Google announced that the Google Translate API for software developers had been deprecated and would cease functioning on December 1, 2011, "due to the substantial economic burden caused by extensive abuse." Because the API was used in numerous third-party websites, this decision led some developers to criticize Google and question the viability of using Google APIs in their products. In response to public pressure, Google announced on June 3, 2011, that the API would continue to be available as a paid service.
Google Translate does not apply grammatical rules, since its algorithms are based on statistical analysis rather than traditional rule-based analysis. The system's original creator, Franz Josef Och, has criticized the effectiveness of rule-based algorithms in favor of statistical approaches. The service is based on a method called statistical machine translation, and more specifically on research by Och, who won the DARPA contest for speed machine translation in 2003. Och was the head of Google's machine translation group until leaving to join Human Longevity, Inc. in July 2014.
Google Translate does not always translate directly from one language to another (L1 → L2). Instead, it often translates first to English and then to the target language (L1 → EN → L2). However, because English, like all human languages, is ambiguous and depends on context, this can cause translation errors. For example, translating vous from French to Russian gives vous → you → ты OR Вы/вы. If Google were using an unambiguous, artificial language as the intermediary, it would be vous → you → Вы/вы OR tu → thou → ты. Distinguishing the words in this way disambiguates their different meanings. Hence, publishing in English, using unambiguous words, providing context, and using expressions such as "you all" often yield a better one-step translation.
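The loss of information in a pivot step can be illustrated with a minimal sketch. The dictionaries below are hypothetical toy data covering only the vous/tu example, and `pivot_translate` is an illustrative helper, not part of any Google API.

```python
# Toy word-level dictionaries (hypothetical data, for illustration only).
FR_TO_EN = {"vous": "you", "tu": "you"}
EN_TO_RU = {"you": ["ты", "Вы", "вы"]}

# A hypothetical unambiguous pivot that keeps the tu/vous distinction.
FR_TO_PIVOT = {"vous": "you", "tu": "thou"}
PIVOT_TO_RU = {"you": ["Вы", "вы"], "thou": ["ты"]}


def pivot_translate(word, source_to_pivot, pivot_to_target):
    """Translate a word through a pivot language and return all candidates."""
    pivot_word = source_to_pivot.get(word)
    if pivot_word is None:
        return []
    return list(pivot_to_target.get(pivot_word, []))


# Through English, "vous" and "tu" collapse to "you", so the Russian side
# can no longer tell which form was meant.
via_english = pivot_translate("vous", FR_TO_EN, EN_TO_RU)

# Through the unambiguous pivot, the informal form is never offered for "vous".
via_unambiguous = pivot_translate("vous", FR_TO_PIVOT, PIVOT_TO_RU)
```

With the English pivot, both French pronouns produce the same three Russian candidates; with the unambiguous pivot, each pronoun maps only to its own forms.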
The following languages do not have a direct Google translation to or from English. These languages are translated through the indicated intermediate language (which in all cases is closely related to the desired language but more widely spoken) in addition to through English:
- Belarusian (be ↔ ru ↔ en ↔ other);
- Catalan (ca ↔ es ↔ en ↔ other);
- Galician (gl ↔ pt ↔ en ↔ other);
- Haitian Creole (ht ↔ fr ↔ en ↔ other);
- Slovak (sk ↔ cs ↔ en ↔ other);
- Ukrainian (uk ↔ ru ↔ en ↔ other);
- Urdu (ur ↔ hi ↔ en ↔ other).
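The chains in the list above can be expressed as a small routing table. This is a sketch that assumes every route passes through English, as the notation in the list suggests; the names `BRIDGE`, `path_to_english` and `route` are hypothetical.

```python
# Intermediate language for each code in the list above.
BRIDGE = {
    "be": "ru",
    "ca": "es",
    "gl": "pt",
    "ht": "fr",
    "sk": "cs",
    "uk": "ru",
    "ur": "hi",
}


def path_to_english(lang):
    """Return the chain of languages leading from `lang` to English."""
    if lang == "en":
        return ["en"]
    if lang in BRIDGE:
        return [lang, BRIDGE[lang], "en"]
    return [lang, "en"]


def route(source, target):
    """Return the full chain source -> ... -> en -> ... -> target."""
    if source == target:
        return [source]
    up = path_to_english(source)
    down = path_to_english(target)[::-1]
    return up + down[1:]  # drop the duplicated "en" at the join
```

For example, `route("uk", "fr")` gives `["uk", "ru", "en", "fr"]`, which shows why an error introduced at any intermediate step can carry through to the final output.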
According to Och, a solid base for developing a usable statistical machine translation system for a new pair of languages from scratch would consist of a bilingual text corpus (or parallel collection) of more than 150-200 million words, and two monolingual corpora each of more than a billion words. Statistical models from these data are then used to translate between those languages.
To acquire this huge amount of linguistic data, Google used United Nations documents. The UN typically publishes documents in all six official UN languages, which has produced a very large 6-language corpus.
Google representatives have been involved with domestic conferences in Japan where Google has solicited bilingual data from researchers.
When Google Translate generates a translation, it looks for patterns in hundreds of millions of documents to help decide on the best translation. By detecting patterns in documents that have already been translated by human translators, Google Translate makes informed guesses as to what an appropriate translation should be.
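How word correspondences can be learned from human-translated text may be sketched with IBM Model 1, a classic textbook statistical alignment model. It is used here purely as an illustration: the source does not say which models Google uses, and the three-sentence corpus is a toy assumption.

```python
from collections import defaultdict


def train_ibm_model1(pairs, iterations=10):
    """Estimate t(target word | source word) with IBM Model 1 EM (no NULL word)."""
    target_vocab = {tw for _, tgt in pairs for tw in tgt}
    uniform = 1.0 / len(target_vocab)
    t = defaultdict(lambda: uniform)  # start from a uniform guess
    for _ in range(iterations):
        count = defaultdict(float)  # expected (target, source) pairings
        total = defaultdict(float)  # expected pairings per source word
        # E-step: share each target word among the source words of its sentence.
        for src, tgt in pairs:
            for tw in tgt:
                norm = sum(t[(tw, sw)] for sw in src)
                for sw in src:
                    share = t[(tw, sw)] / norm
                    count[(tw, sw)] += share
                    total[sw] += share
        # M-step: renormalise the expected counts into probabilities.
        t = defaultdict(float, {
            (tw, sw): c / total[sw] for (tw, sw), c in count.items()
        })
    return dict(t)


def best_translation(table, source_word):
    """Pick the most probable target word for a source word."""
    candidates = {tw: p for (tw, sw), p in table.items() if sw == source_word}
    return max(candidates, key=candidates.get)


# Toy parallel corpus (hypothetical): tokenised German/English sentence pairs.
CORPUS = [
    (["das", "haus"], ["the", "house"]),
    (["das", "buch"], ["the", "book"]),
    (["ein", "buch"], ["a", "book"]),
]

table = train_ibm_model1(CORPUS)
```

No dictionary is supplied: the model infers that "haus" pairs with "house" only because "das" is better explained by "the" in the other sentences. Real systems need corpora of the size Och describes for such statistics to be reliable.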
Before October 2007, for languages other than Arabic, Chinese and Russian, Google Translate was based on SYSTRAN, a software engine which is still used by several other online translation services such as Yahoo! Babel Fish (now defunct). Since October 2007, Google Translate has used proprietary, in-house technology based on statistical machine translation instead.
Google Translate, like other automatic translation tools, has its limitations. The service limits the number of paragraphs and the range of technical terms that can be translated, and while it can help the reader understand the general content of a foreign language text, it does not always deliver accurate translations, and it often repeats verbatim the very word it is expected to translate. Grammatically, for example, Google Translate struggles to differentiate between imperfect and perfect tenses in Romance languages, so habitual and continuous acts in the past often become single historical events. Although seemingly pedantic, this can often lead to incorrect results (to a native speaker of, for example, French or Spanish) which would have been avoided by a human translator. Knowledge of the subjunctive mood is virtually non-existent. Moreover, the informal second person (tu) is often chosen, whatever the context or accepted usage. Since its English reference material contains only "you" forms, it is difficult to translate into a language which has more than one.
Some languages produce better results than others. Google Translate performs well especially when English is the target language and the source language is from the European Union due to the prominence of translated EU parliament notes. A 2010 analysis indicated that French to English translation is relatively accurate, and 2011 and 2012 analyses showed that Italian to English translation is relatively accurate as well. However, if the source text is shorter, rule-based machine translations often perform better; this effect is particularly evident in Chinese to English translations. While edits of translations may be submitted, in Chinese specifically one is not able to edit sentences as a whole. Instead, one must edit sometimes arbitrary sets of characters, leading to incorrect edits.
Texts written in the Greek, Devanagari, Cyrillic and Arabic scripts can be transliterated automatically from phonetic equivalents written in the Latin alphabet. The browser version of Google Translate provides the "read phonetically" option for Japanese to English conversion. The same option is not available on the paid API version.
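Transliteration from Latin phonetic input into a native script can be sketched as a greedy longest-match lookup. The mapping below is a simplified, hypothetical Latin-to-Cyrillic table chosen for illustration; it is not the scheme Google Translate uses.

```python
# Simplified, hypothetical Latin -> Cyrillic table (illustration only).
LATIN_TO_CYRILLIC = {
    "shch": "щ", "zh": "ж", "kh": "х", "ts": "ц", "ch": "ч", "sh": "ш",
    "ya": "я", "yu": "ю",
    "a": "а", "b": "б", "v": "в", "g": "г", "d": "д", "e": "е", "z": "з",
    "i": "и", "k": "к", "l": "л", "m": "м", "n": "н", "o": "о", "p": "п",
    "r": "р", "s": "с", "t": "т", "u": "у", "f": "ф", "y": "ы",
}
MAX_KEY_LENGTH = max(len(key) for key in LATIN_TO_CYRILLIC)


def transliterate(text):
    """Convert Latin phonetic input to Cyrillic, longest match first."""
    lowered = text.lower()
    output = []
    i = 0
    while i < len(lowered):
        # Try the longest possible chunk first so "sh" wins over "s" + "h".
        for size in range(MAX_KEY_LENGTH, 0, -1):
            chunk = lowered[i:i + size]
            if chunk in LATIN_TO_CYRILLIC:
                output.append(LATIN_TO_CYRILLIC[chunk])
                i += len(chunk)
                break
        else:
            # Unknown character (space, digit, punctuation): copy unchanged.
            output.append(lowered[i])
            i += 1
    return "".join(output)
```

Trying the longest chunk first matters for digraphs: without it, "khorosho" would be split letter by letter instead of mapping "kh" and "sh" to single Cyrillic letters.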
Many of the more popular languages have a "text-to-speech" audio function that can read back a text in that language, up to a few dozen words or so. In the case of pluricentric languages, the accent depends on the region. For English, the audio uses a female General American accent in the Americas, most of the Asia-Pacific and West Asia, and a female British English accent in Europe, Hong Kong, Malaysia, Singapore, Guyana and all other parts of the world, except for a special Oceania accent used in Australia, New Zealand and Norfolk Island. For Spanish, a Latin American Spanish accent is used in the Americas and a Castilian Spanish accent in the rest of the world. Portuguese uses a São Paulo accent everywhere except Portugal, where the native accent is used. Some less widely spoken languages use the open-source eSpeak synthesizer for their speech, producing a robotic, awkward voice that may be difficult to understand.
- 1st stage
- 2nd stage
- English to and from Portuguese
- 3rd stage
- English to and from Italian
- 4th stage
- 5th stage (launched April 28, 2006)
- English to and from Arabic
- 6th stage (launched December 16, 2006)
- English to and from Russian
- 7th stage (launched February 9, 2007)
- 8th stage (all 25 language pairs use Google's machine translation system) (launched October 22, 2007)
- 9th stage
- English to and from Hindi
- 10th stage (as of this stage, translation can be done between any two languages, using English as an intermediate step, if needed) (launched May 8, 2008)
- 11th stage (launched September 25, 2008)
- 12th stage (launched January 30, 2009)
- 13th stage (launched June 19, 2009)
- 14th stage (launched August 24, 2009)
- 15th stage (launched November 19, 2009)
- The Beta stage is finished. Users can now choose to have the romanization written for Chinese, Japanese, Korean, Russian, Ukrainian, Belarusian, Bulgarian, Greek, Hindi and Thai. For translations from Arabic, Persian and Hindi, the user can enter a Latin transliteration of the text and the text will be transliterated to the native script for these languages as the user is typing. The text can now be read by a text-to-speech program in English, Italian, French and German.
- 16th stage (launched January 30, 2010)
- 17th stage (launched April 2010)
- Speech program launched in Hindi and Spanish.
- 18th stage (launched May 5, 2010)
- Speech program launched in Afrikaans, Albanian, Catalan, Chinese (Mandarin), Croatian, Czech, Danish, Dutch, Finnish, Greek, Hungarian, Icelandic, Indonesian, Latvian, Macedonian, Norwegian, Polish, Portuguese, Romanian, Russian, Serbian, Slovak, Swahili, Swedish, Turkish, Vietnamese and Welsh (based on eSpeak)
- 19th stage (launched May 13, 2010)
- 20th stage (launched June 2010)
- Provides romanization for Arabic.
- 21st stage (launched September 2010)
- 22nd stage (launched December 2010)
- Romanization of Arabic removed.
- Spell check added.
- For some languages, Google replaced eSpeak's robotic text-to-speech voice with natural-sounding native-speaker voices made by SVOX (Chinese, Czech, Danish, Dutch, Finnish, Greek, Hungarian, Norwegian, Polish, Portuguese, Russian, Swedish, Turkish). The old French, German, Italian and Spanish voices were replaced as well. Latin uses the same synthesizer as Italian.
- Speech program launched in Arabic, Japanese and Korean.
- 23rd stage (launched January 2011)
- Choice of different translations for a word.
- 24th stage (launched June 2011)
- 25th stage (launched July 2011)
- Translation rating introduced.
- 26th stage (launched January 2012)
- Dutch male voice synthesizer replaced with female.
- Elena by SVOX replaced the Slovak eSpeak voice.
- Transliteration of Yiddish added.
- 27th stage (launched February 2012)
- Speech program launched in Thai.
- 28th stage (launched September 2012)
- 29th stage (launched October 2012)
- 30th stage (launched October 2012)
- New speech program launched in English.
- 31st stage (launched November 2012)
- New speech program in French, Spanish, Italian and German.
- 32nd stage (launched March 2013)
- Phrasebook added.
- 33rd stage (launched April 2013)
- 34th stage (launched May 2013)
- 35th stage (launched May 2013)
- 16 additional languages can be used with camera-input: Bulgarian, Catalan, Danish, Estonian, Finnish, Croatian, Hungarian, Indonesian, Icelandic, Lithuanian, Latvian, Norwegian, Romanian, Slovak, Slovenian and Swedish.
- 36th stage (launched December 2013)
- 37th stage (launched June 2014)
- Definition of words added.
- 38th stage (launched December 2014)
Languages not yet supported by Google Translate
Languages not yet supported by Google Translate, but in process.
Open-source licenses and components
|Language||WordNet||License|
|Albanian||Albanet||CC-BY 3.0/GPL 3|
|Arabic||Arabic Wordnet||CC-BY-SA 3|
|Catalan||Multilingual Central Repository||CC-BY-3.0|
|French||WOLF (WOrdnet Libre du Français)||CeCILL-C|
|Galician||Multilingual Central Repository||CC-BY-3.0|
|Hindi||IIT Bombay Wordnet||Indo Wordnet|
|Persian||Persian Wordnet||Free to Use|
|Spanish||Multilingual Central Repository||CC-BY-3.0|
Shortly after launching the translation service, Google won an international competition for English–Arabic and English–Chinese machine translation.
Translation mistakes and oddities
Since Google Translate uses statistical matching to translate, translated text can often include apparently nonsensical and obvious errors, often swapping common terms for similar but nonequivalent common terms in the other language, as well as inverting sentence meaning. Also, for speech, it uses only European French and only Latin American Spanish worldwide, but both European and Brazilian Portuguese (European for translate.google.pt and Brazilian for all other Google Translate sites).
Translate Community is a platform intended to improve the Google Translate service. Volunteers can select up to five languages in which to help improve translations. Users can verify translated phrases and translate phrases in their languages to and from English, helping to improve the accuracy of translation for rarer and more complex phrases.
- Asia Online
- Bing Translator
- Comparison of machine translation applications
- Google Dictionary (discontinued)
- Google Text-to-Speech
- Jollo (discontinued)
- List of Google products
- Word Lens (discontinued; merged into Google Translate app)
- Yahoo! Babel Fish (discontinued; redirects to main Yahoo! site)