Google Translate

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by 192.12.88.224 (talk) at 21:27, 28 April 2014 (added Open Source Licenses and Components that Google uses for this tool). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

Google Translate
Type of site: Machine translation
Owner: Google
URL: translate.google.com
Registration: Optional

Google Translate is a free, multilingual statistical machine-translation service provided by Google Inc. to translate written text from one language into another.

Before October 2007, for languages other than Arabic, Chinese and Russian, Google used a SYSTRAN-based translator,[1][2] which is also used by other translation services such as Yahoo! Babel Fish and AOL.

On May 26, 2011, Google announced that the Google Translate API had been deprecated and that it would cease functioning on December 1, 2011, "due to the substantial economic burden caused by extensive abuse."[3][4] The shutting down of the API, used by a number of websites, led to criticism of Google and to developers questioning the viability of using Google APIs in their products.[5][6]

On June 3, 2011, Google announced that it was cancelling the plan to terminate the Translate API due to public pressure. In the same announcement, Google said that it would release a paid version of the Translate API.[3][7]

Features and limitations

The service limits the number of paragraphs, or range of technical terms, that will be translated. It is possible to enter searches in a source language that are first translated to a destination language, allowing one to browse and interpret results from the selected destination language in the source language.[8] For some languages, users are asked for alternate translations, such as for technical terms, to be included in future updates to the translation process. Text in a foreign language can be typed, and if "Detect language" is selected, the service will not only detect the language but also translate the text into English by default.

The homepage of English Wikipedia translated into Portuguese

Google Translate, like other automatic translation tools, has its limitations. While it can help the reader to understand the general content of a foreign-language text, it does not always deliver accurate translations. Some languages produce better results than others. Google Translate performs especially well when English is the target language and the source language is one of the languages of the European Union. Analyses reported in 2010 showed that French-to-English translation is relatively accurate,[9] and analyses in 2011 and 2012 showed that Italian-to-English translation is relatively accurate as well.[10][11] However, rule-based machine translations perform better if the text to be translated is shorter; this effect is particularly evident in Chinese-to-English translations. Edits of translations may be submitted, but in Chinese specifically one is not able to edit a sentence as a whole; instead, edits are made to arbitrary sets of characters, which leads to incorrect edits.[9]

Texts written in the Greek, Devanagari, Cyrillic and Arabic scripts can be transliterated automatically from phonetic equivalents written in the Latin alphabet. The browser version of Google Translate provides a "read phonetically" option for Japanese-to-English conversion. The same option is not available in the paid API version.

Map: accent of English used by the "text-to-speech" audio of Google Translate in each country. Legend: British English (female); American English (female); Oceania accent (female); no Google Translate service.

Many of the more popular languages have a "text-to-speech" audio function that can read back a text in that language, up to a few dozen words or so. In the case of pluricentric languages, the accent depends on the region. For English, a female General American accent is used in the Americas, most of the Asia-Pacific and West Asia, whereas a female British English accent is used in Europe, Hong Kong, Malaysia, Singapore, Guyana and all other parts of the world, except for a special Oceania accent used in Australia, New Zealand and Norfolk Island. For Spanish, a Latin American Spanish accent is used in the Americas, while a Castilian Spanish accent is used in the other parts of the world. Portuguese uses a São Paulo accent worldwide, except in Portugal, where the native accent is used. For other, less popular languages, the audio is a garbled, monotonous, vaguely male, low-quality voice.[citation needed]

Browser integration

A number of Firefox extensions exist for Google services, including Google Translate; these allow right-click command access to the translation service.[12]

An extension for Google's Chrome browser also exists;[13] in February 2010, Google Translate was integrated into the standard Google Chrome browser for automatic webpage translation.[14][15]

Android version

Google Translate is available as a free downloadable application for Android OS users. The first version was launched in January 2010. It works much like the browser version. Google Translate for Android contains two main options: "SMS translation" and "History".

An early 2011 version supported Conversation Mode when translating between English and Spanish (in alpha testing). This interface within Google Translate allows users to communicate fluidly with a nearby person in another language. In October 2011 it was expanded to 14 languages.[16]

The application supports 53 languages and voice input for 15 languages. It is available for devices running Android 2.1 and above and can be downloaded by searching for “Google Translate” in Google Play. An improved version became available on January 12, 2011.[17]

Latest version: 2.0.0 build 42.

iOS version

In August 2008, Google launched a Google Translate HTML5 web application for iOS for iPhone and iPod Touch users. The official iOS app for Google Translate was released February 8, 2011. It accepts voice input for 15 languages and allows translation of a word or phrase into one of more than 50 languages. Translations can be spoken out loud in 23 different languages.[18]

  • 1st stage
    • English to German
    • English to Spanish
    • French to English
    • German to English
    • Spanish to English
  • 2nd stage
    • Portuguese to English
    • Dutch to English
  • 3rd stage
  • 4th stage
    • English to Chinese (Simplified)
    • English to Japanese
    • English to Korean
    • Chinese (Simplified) to English
    • Japanese to English
    • Korean to English
  • 5th stage (launched April 2006)[19]
    • Arabic to English
  • 6th stage (launched December 2006)
    • English to Russian
    • Russian to English
  • 7th stage (launched February 2007)
    • English to Chinese (Traditional)
    • Chinese (Simplified to Traditional)
    • Chinese (Traditional) to English
    • Chinese (Traditional to Simplified)
  • 8th stage (launched October 2007)
    • All 25 language pairs use Google's machine translation system.
  • 9th stage
    • Hindi to English
  • 10th stage (launched May 2008)
    • As of this stage, translation can be done between any two languages, using English as an intermediate step if needed.
  • 11th stage (launched September 25, 2008)
  • 12th stage (launched January 30, 2009)
  • 13th stage (launched June 19, 2009)
  • 14th stage (launched August 24, 2009)
  • 15th stage (launched November 19, 2009)
    • The Beta stage is finished. Users can now choose to have the romanization written for Chinese, Japanese, Korean, Russian, Ukrainian, Belarusian, Bulgarian, Greek, Hindi and Thai. For translations from Arabic, Persian and Hindi, the user can enter a Latin transliteration of the text, and the text will be transliterated to the native script for these languages as the user is typing. The text can now be read by a text-to-speech program in English, Italian, French and German.
  • 16th stage (launched January 30, 2010)
  • 17th stage (launched April 2010)
    • Speech program launched in Hindi and Spanish
  • 18th stage (launched May 5, 2010)
    • Speech program launched in Afrikaans, Albanian, Catalan, Chinese (Mandarin), Croatian, Czech, Danish, Dutch, Finnish, Greek, Hungarian, Icelandic, Indonesian, Latvian, Macedonian, Norwegian, Polish, Portuguese, Romanian, Russian, Serbian, Slovak, Swahili, Swedish, Turkish, Vietnamese and Welsh (based on eSpeak).[20]
  • 19th stage (launched May 13, 2010)[21]
  • 20th stage (launched June 2010)
    • Provides romanization for Arabic.
  • 21st stage (launched September 2010)
    • Allows phonetic typing for Arabic, Greek, Hindi, Persian, Russian, Serbian and Urdu.
    • Latin[22]
  • 22nd stage (launched December 2010)
    • Romanization of Arabic removed.
    • Spell check added.
    • Google replaced the robotic eSpeak voice with natural-sounding native-speaker voice technologies made by SVOX[23] for some languages' text-to-speech synthesizers (Chinese, Czech, Danish, Dutch, Finnish, Greek, Hungarian, Norwegian, Polish, Portuguese, Russian, Swedish, Turkish), as well as for the old versions of French, German, Italian and Spanish. Latin uses the same synthesizer as Italian.
    • Speech program launched in Arabic, Japanese, and Korean.
  • 23rd stage (launched January 2011)
    • Choice of different translations for a word.
  • 25th stage (launched July 2011)
    • Translation rating introduced.
  • 26th stage (launched January 2012)
    • Dutch male voice synthesizer replaced with female.
    • Elena by SVOX replaced the Slovak eSpeak voice.
    • Transliteration of Yiddish added.
  • 27th stage (launched February 2012)
    • Speech program launched in Thai.
    • Esperanto added.
  • 28th stage (launched September 2012)
    • Lao added (alpha status).[25][26]
  • 29th stage (launched October 2012)
    • Transliteration of Lao added.

  • 30th stage (launched October 2012)
    • New speech program launched in English
  • 31st stage (launched November 2012)
    • New speech program in French, Spanish, Italian, and German
  • 32nd stage (launched March 2013)
    • Phrasebook added.
  • 33rd stage (launched April 2013)
  • 35th stage (launched May 2013)
    • 16 additional languages can be used with camera-input: Bulgarian, Catalan, Danish, Estonian, Finnish, Croatian, Hungarian, Indonesian, Icelandic, Lithuanian, Latvian, Norwegian, Romanian, Slovak, Slovenian, and Swedish.

Translation methodology

Google Translate does not apply grammatical rules, since its algorithms are based on statistical analysis rather than traditional rule-based analysis. Indeed, the system's original creator, Franz Josef Och, has criticized the effectiveness of rule-based algorithms in favor of statistical approaches.[27] The service is based on a method called statistical machine translation and, more specifically, on research by Och, who won the DARPA contest for speed machine translation in 2003. He is now the head of Google's machine translation group.[28]

Google often does not translate directly from one language to another (L1 → L2), but translates first to English and then to the target language (L1 → EN → L2).[29][30][31][32] However, because English, like all human languages, is ambiguous and depends on context, this can cause translation errors. For example, translating vous from French to Russian gives vous → you → ты OR Вы/вы.[33] If Google were using an unambiguous, artificial language as the intermediary, it would be vous → you → Вы/вы OR tu → thou → ты. Distinguishing the intermediate words in this way disambiguates their different meanings. Hence, publishing in English, using unambiguous words, providing context, and using expressions such as "you all" often make for a better one-step translation.
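
The loss of information through an English pivot can be illustrated with a minimal Python sketch. The three small dictionaries and the function name pivot_translate are hypothetical and hand-made for this illustration; they are not Google's data or code, and a real statistical system scores whole phrases rather than looking up single words.

```python
# Toy word lists, invented for illustration only (not Google's data).
FR_TO_EN = {"vous": ["you"], "tu": ["you"]}   # both French pronouns collapse to "you"
EN_TO_RU = {"you": ["ты", "вы"]}              # English "you" is ambiguous in Russian
FR_TO_RU = {"vous": ["вы"], "tu": ["ты"]}     # a direct table keeps the distinction


def pivot_translate(word, src_to_pivot, pivot_to_tgt):
    """Return every target word reachable from `word` through the pivot language."""
    candidates = []
    for pivot_word in src_to_pivot.get(word, []):
        for target in pivot_to_tgt.get(pivot_word, []):
            if target not in candidates:
                candidates.append(target)
    return candidates


# Through English, "vous" and "tu" yield the same ambiguous candidate list.
print(pivot_translate("vous", FR_TO_EN, EN_TO_RU))  # ['ты', 'вы']
print(pivot_translate("tu", FR_TO_EN, EN_TO_RU))    # ['ты', 'вы']
# A direct lookup would not be ambiguous.
print(FR_TO_RU["vous"])                             # ['вы']
```

In this sketch the ambiguity arises purely because two source words share one pivot word, which is the effect described for vous and tu.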

The following languages do not have a direct Google translation to or from English. These languages are translated through the indicated intermediate language (which in all cases is closely related to the desired language but more widely spoken) in addition to through English:[citation needed]

Overlooking the grammar of the language can cause mistakes. For example, consider the following sentence:
Пишет (3rd person: it writes) вам (dative: to you (all)) письмо (letter) семья (family) Дарьи (genitive: of Daria).
Based on the word order, Google translates: "You wrote a letter to family Darya."[34]
Based on declensions (word functions), it means "[it's] Daria's family [that] writes you a letter", exactly the opposite.
Google took "you" for "to you", "Daria" for "of Daria", and "to the family" for "the family".
When translating back to Russian, however, Google says: Семья Дарьи пишет вам письмо.[35]
That is correct because Google understood the English word order.
Respecting the same word order as in English, or publishing in English as above, may help.
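
The difference between the two readings of the Russian sentence can be sketched in Python. This is a toy contrast only: the case tags are annotated by hand (a real system would need a morphological analyser), and the function names roles_by_position and roles_by_case are hypothetical. It is not a description of how Google Translate processes text.

```python
# Hand-annotated tokens of the example sentence: (word, grammatical tag).
SENTENCE = [
    ("Пишет", "verb"),
    ("вам", "dative"),
    ("письмо", "accusative"),
    ("семья", "nominative"),
    ("Дарьи", "genitive"),
]


def roles_by_position(tokens):
    """English-style guess: the first non-verb word is the subject, the next the object."""
    nominals = [word for word, tag in tokens if tag != "verb"]
    return {"subject": nominals[0], "object": nominals[1]}


def roles_by_case(tokens):
    """Assign roles from case markings, ignoring word order."""
    case_to_role = {
        "nominative": "subject",
        "accusative": "object",
        "dative": "recipient",
        "genitive": "possessor",
    }
    return {case_to_role[tag]: word for word, tag in tokens if tag in case_to_role}


print(roles_by_position(SENTENCE)["subject"])  # вам   ("you" mistaken for the subject)
print(roles_by_case(SENTENCE)["subject"])      # семья ("family" is the subject)
```

The positional guess reproduces the kind of error in the English output quoted above, while the case-based reading recovers the family as the subject and "you" as the recipient.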

According to Och, a solid base for developing a usable statistical machine translation system for a new pair of languages from scratch would consist of a bilingual text corpus (or parallel collection) of more than a million words, and two monolingual corpora each of more than a billion words.[27] Statistical models from these data are then used to translate between those languages.

To acquire this huge amount of linguistic data, Google used United Nations documents.[36] The UN typically publishes documents in all six official UN languages, which has produced a very large 6-language corpus.

Google representatives have been involved with domestic conferences in Japan where Google has solicited bilingual data from researchers.[37]

When Google Translate generates a translation, it looks for patterns in hundreds of millions of documents to help decide on the best translation. By detecting patterns in documents that have already been translated by human translators, Google Translate makes intelligent guesses (AI) as to what an appropriate translation should be.[38]
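
The idea of guessing a translation from patterns in previously translated text can be sketched as a relative-frequency lookup. The aligned pairs below are invented for illustration, and the names build_phrase_table and best_translation are hypothetical. Statistical machine translation systems in general also rely on word alignment, longer phrases and a target-language model, none of which is shown here.

```python
from collections import Counter, defaultdict

# Tiny hand-made "parallel corpus" of aligned (English, French) pairs.
# The pairs and their frequencies are invented for this sketch.
ALIGNED_PAIRS = [
    ("bank", "banque"),
    ("bank", "banque"),
    ("bank", "rive"),
    ("spring", "printemps"),
    ("spring", "printemps"),
    ("spring", "printemps"),
    ("spring", "ressort"),
]


def build_phrase_table(pairs):
    """Count how often each source phrase was translated as each target phrase."""
    table = defaultdict(Counter)
    for source, target in pairs:
        table[source][target] += 1
    return table


def best_translation(table, source):
    """Return the most frequent target and its relative frequency, or None if unseen."""
    counts = table.get(source)
    if not counts:
        return None
    target, count = counts.most_common(1)[0]
    return target, count / sum(counts.values())


table = build_phrase_table(ALIGNED_PAIRS)
print(best_translation(table, "spring"))  # ('printemps', 0.75)
print(best_translation(table, "river"))   # None
```

Because the choice depends only on counts, a rarer but contextually correct translation (here "ressort" for a mechanical spring) would lose to the more frequent one, which is one way such a system can produce the errors discussed in this article.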

Open Source Licenses and Components

Language       Wordnet                           License
Albanian       Albanet                           CC-BY 3.0/GPL 3
Arabic         Arabic Wordnet                    CC-BY-SA 3
Chinese        Chinese Wordnet                   Wordnet
Danish         Dannet                            Wordnet
English        Princeton Wordnet                 Wordnet
Farsi/Persian  Persian Wordnet                   Free to Use
Finnish        FinnWordnet                       Wordnet
French         WOLF (WOrdnet Libre du Français)  CeCILL-C
Hebrew         Hebrew Wordnet                    Wordnet
Italian        MultiWordnet                      CC-BY-3.0
Japanese       Japanese Wordnet                  Wordnet
Catalan        Multilingual Central Repository   CC-BY-3.0
Galician       Multilingual Central Repository   CC-BY-3.0
Spanish        Multilingual Central Repository   CC-BY-3.0
Indonesian     Wordnet Bahasa                    MIT
Malaysian      Wordnet Bahasa                    MIT
Norwegian      Norwegian Wordnet                 Wordnet
Polish         plWordnet                         Wordnet
Portuguese     OpenWN-PT                         CC-BY-SA-3.0
Thai           Thai Wordnet                      Wordnet

[39]

Reviews

Shortly after launching the translation service, Google won an international competition for English–Arabic and English–Chinese machine translation.[40]

Translation mistakes and oddities

Because Google Translate uses statistical matching to translate rather than a dictionary/grammar rules approach, translated text can often include apparently nonsensical and obvious errors,[41] often swapping common terms for similar but nonequivalent common terms in the other language,[42] as well as inverting sentence meaning.[43][citation needed] For speech output, it uses only European French and only Latin American Spanish worldwide, but both European and Brazilian Portuguese (European for translate.google.pt and Brazilian for all other Google Translate sites).

Controversies

Google has been accused of sexism due to the statistical assignment of gender when translating from or through English into languages where verbs are conjugated by gender. For example, the phrase I drive used to be translated into a masculine conjugation, while I cook into a feminine conjugation, due to the higher occurrence of such forms in corpora. Due to public criticism in Israel, Google has manually fixed some apparent cases of sexist translation into Hebrew by using the masculine form for all verbs.[citation needed]

References

  1. ^ Google Switches to Its Own Translation System, October 22, 2007
  2. ^ Google Translate Drops Systran For Home Brewed Translation 23/12/2007, Barry Schwartz, searchengineland.com
  3. ^ a b Feldman, Adam (May 26, 2011). "Spring cleaning for some of our APIs". Google Code. Retrieved May 28, 2011.
  4. ^ "Google Translate API (Deprecated)". Google Code. 2011. Archived from the original on May 28, 2011. Retrieved May 28, 2011.
  5. ^ Wong, George (May 27, 2011). "Google gets rid of APIs for Translate and other services". ubergizmo. Retrieved May 28, 2011.
  6. ^ Burnette, Ed (May 27, 2011). "Google pulls the rug out from under web service API developers, nixes Google Translate and 17 others". ZDNet. Retrieved May 28, 2011.
  7. ^ "Google cancels plan to shutdown Translate API. To start charging for translations". June 4, 2011. Retrieved June 4, 2011.
  8. ^ "Google Translate". Google. Retrieved January 24, 2009.
  9. ^ a b Ethan Shen (2010), "Comparison of online machine translation tools", www.tcworld.info, archived from the original on February 10, 2011, retrieved December 15, 2011
  10. ^ Christopher Pecoraro (2011), "Microsoft Bing Translator and Google Translate Compared for Italian to English Translation", irventu.com, retrieved April 8, 2012
  11. ^ Christopher Pecoraro (2012), "Microsoft Bing Translator and Google Translate Compared for Italian to English Translation (update)", irventu.com, retrieved April 8, 2012
  12. ^ "Search Add-ons :: Add-ons for Firefox". Mozilla. Retrieved August 7, 2009.
  13. ^ Google Translate by chrome.translate.extension chrome.google.com
  14. ^ "Google Translate Integrated In Google Chrome 5". Ghacks.net. February 14, 2010. Retrieved December 22, 2011.
  15. ^ Google Chrome 5 features an integrated Google Translate service 15/2/2010 , stuff.techwhack.com
  16. ^ Gigaom.com 2011 October 13 by Ryan Kim. Google Translate conversation mode expands to 14 languages
  17. ^ A new look for Google Translate for Android, Awaneesh Verma, Google Translate blog, January 12, 2011
  18. ^ Introducing the Google Translate app for iPhone, Wenzhang Zhu, Google Translate blog, February 8, 2011
  19. ^ Statistical machine translation live, Franz Josef Och, Google Research blog, April 28, 2006
  20. ^ Henderson, Fergus (November 5, 2010). "Official Google Blog: Giving a voice to more languages on Google Translate". Googleblog.blogspot.com. Retrieved December 22, 2011.
  21. ^ "Five more languages on translate.google.com – Google Translate Blog". Googletranslate.blogspot.com. May 13, 2010. Retrieved December 22, 2011.
  22. ^ Jakob Uszkoreit, Ingeniarius Programmandi (September 30, 2010). "Official Google Blog: Veni, Vidi, Verba Verti". Googleblog.blogspot.com. Retrieved December 22, 2011.
  23. ^ [1][dead link]
  24. ^ Google Translate Blog: Google Translate welcomes you to the Indic web
  25. ^ Brants, Thorsten (September 13, 2012). "Translating Lao". Google Translate blog. Retrieved September 19, 2012.
  26. ^ Crum, Chris (September 13, 2012). "Google Adds Its 65th Language To Google Translate With Lao". WebProNews. Retrieved September 19, 2012.
  27. ^ a b Och, Franz Josef (September 12, 2005), "Statistical Machine Translation: Foundations and Recent Advances" (PDF), The Tenth Machine Translation Summit, Phuket, Thailand, retrieved December 19, 2010
  28. ^ "Franz Josef Och". Google. Retrieved December 19, 2010. Franz Josef Och joined Google in 2004 as a research scientist, where he leads the machine translation group.
  29. ^ French to Russian translation translates the untranslated non-French word "obvious" from pivot (intermediate) English to Russian le mot 'obvious' n'est pas français → «очевидными» слово не французское
  30. ^ We pretend that this English article is German when asking Google to translate it to French. Google, because it does not find the English words in the German dictionary, leaves those words unchanged as one can show it with this spelllling misssstake. But it translates them to French nonetheless. That's because Google translates German → English → French and that the unchanged English words undergo the second translation. The word "außergewöhnlich" however will be translated twice.
  31. ^ "Google Translate performs two-step translation through English" (PDF). Retrieved December 22, 2011.
  32. ^ a b Wrong translation to Ukrainian language because going through both Russian and English
  33. ^ Google Translation mixes up "tu" and plural or polite "vous" Je vous aime. Tu es ici. You are here. → Я люблю тебя. Вы здесь. Вы здесь.
  34. ^ The meaning of the English translation is the inverse of the Russian sentence ... Пишет вам письмо семья Дарьи → You wrote a letter to family Darya
  35. ^ ... but the English to Russian translation is correct Daria's family writes you a letter → Семья Дарьи пишет вам письмо
  36. ^ Google seeks world of instant translations (Reuters)
  37. ^ Google was an official sponsor of the annual Computational Linguistics in Japan Conference ("Gengoshorigakkai") in 2007. Google also sent a delegate to the meeting of the members of the Computational Linguistic Society of Japan in March 2005, promising funding to researchers who would be willing to share text data.
  38. ^ Find out how our translations are created
  39. ^ https://translate.google.com/about/intl/en_ALL/
  40. ^ Nielsen, Michael. Reinventing discovery : the new era of networked science. Princeton, N.J.: Princeton University Press. p. 125. ISBN 978-0-691-14890-8.
  41. ^ Google Translate Tangles With Computer Learning Lee Gomes , Forbes Magazine , 9/8/2010
  42. ^ Google Translates Ivan the Terrible as “Abraham Lincoln” google.blognewschannel.com
  43. ^ Translation Laura Mestre