|Industry||Translation / Portals|
|Founder||Gregory Binger, Dion Wiggins, Bob Hayward, Philipp Koehn|
|Number of locations||Singapore, Thailand, Los Angeles, Indonesia|
|Key people||Gregory Binger, Dion Wiggins, Bob Hayward, Philipp Koehn, Tim Cox|
|Products||Language Studio Machine Translation and Language Processing Platform|
|Services||Automated translation, custom machine translation engines, language processing|
Asia Online is a privately owned automated translation company backed by individual investors and institutional venture capital. Its corporate headquarters are in Singapore, with significant operations in Bangkok, Thailand, R&D activities throughout Asia, and sales operations in Europe and North America. The firm was founded in 2007 by Philipp Koehn of the University of Edinburgh; Gregory Binger, a technologist and IT/IP lawyer; and former Gartner senior analysts Bob Hayward and Dion Wiggins.
The firm is undertaking what it calls the world's largest literacy project: translating vast quantities of the world's English-language knowledge into Asian languages. This is done using statistical machine translation (SMT) technologies developed and enhanced in Thailand with a specific focus on Asian languages. Despite its name, Asia Online is not limited to Asian languages; it also supports translation among all 23 official EU languages.
The firm's statistically based translation software employs recent advances in automated translation. Until the early 1990s, almost all production-level machine translation technology relied on collections of linguistic rules to analyze the source sentence and then map its syntactic and semantic structure into the target language. The firm's approach instead uses statistical techniques from cryptography, applying machine learning algorithms that automatically acquire statistical models from existing parallel collections of human translations, in the same way as Google Translate and the systems built with Moses, Koehn's own open source SMT toolkit.
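To make the idea of acquiring a statistical model from parallel translations concrete, the following is a minimal sketch of IBM Model 1, a textbook word-translation model trained with expectation-maximization. It is an illustration only and is not claimed to be Asia Online's or Moses's actual training procedure. The function name `train_ibm_model1` and the toy corpus are hypothetical, and the NULL source word used in the full model is omitted for brevity.

```python
from collections import defaultdict

def train_ibm_model1(pairs, iterations=10):
    """Estimate word-translation probabilities t(target | source) with EM.

    `pairs` is a list of (source_tokens, target_tokens) sentence pairs.
    Returns a dict-like table keyed by (target_word, source_word).
    """
    target_vocab = {word for _, target in pairs for word in target}
    uniform = 1.0 / len(target_vocab)
    # Start from a uniform distribution over target words.
    t = defaultdict(lambda: uniform)

    for _ in range(iterations):
        count = defaultdict(float)  # expected (target, source) co-occurrences
        total = defaultdict(float)  # expected counts per source word

        # E-step: spread each target word's mass over the source words
        # in the same sentence pair, in proportion to the current t.
        for source, target in pairs:
            for tw in target:
                norm = sum(t[(tw, sw)] for sw in source)
                for sw in source:
                    frac = t[(tw, sw)] / norm
                    count[(tw, sw)] += frac
                    total[sw] += frac

        # M-step: renormalize expected counts into probabilities.
        for (tw, sw), c in count.items():
            t[(tw, sw)] = c / total[sw]

    return t

# Toy parallel corpus (hypothetical English-German data).
corpus = [
    (["the", "house"], ["das", "haus"]),
    (["the", "book"], ["das", "buch"]),
    (["a", "book"], ["ein", "buch"]),
]
table = train_ibm_model1(corpus)
```

With this toy corpus, the model learns to prefer "das" as the translation of "the" purely from co-occurrence statistics, without any hand-written linguistic rules. Production SMT systems build phrase-based models and language models on top of word alignments of this kind.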
On January 7, 2011, Asia Online launched its Thai-language consumer portal, funded in part by CAT Telecom and the Thai Ministry of ICT. All 3.6 million English-language Wikipedia articles were translated from English into Thai. Then-Prime Minister Abhisit Vejjajiva and Minister of ICT Chuti Krairiksh launched the site as part of Thailand's Children's Day celebrations. A crowdsourcing approach is being taken to proofread the articles after they have been machine translated.
Differences from other approaches
- Clean data: The traditional approach leveraged content found on the web in corporate sites, news articles, and similar sources where the same content was available in multiple languages; this yields low-quality data. Asia Online has focused machine and human resources on ensuring that its data is as clean and accurate as possible. The company's data is sourced from high-quality translations provided by book publishers and translation companies. It is aligned at the segment level (usually sentences) and converted into a consistent format so that it can be processed by the learning software. This step includes extracting segments from files and documents that are not already in TMX format. The extracted segments are then aligned and processed by machines, with humans validating the accuracy. The data is converted to a base UTF-8 encoding for training the SMT system, small subsets are extracted to guide training, and finally the data is reviewed, cleaned, and analyzed.
- Multiple domains: The system allows for training in many domains by extending a base set of information with multiple additional learning sources.
- Real-time corrections
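The data-cleaning steps described under "Clean data" can be sketched roughly as follows. This is a hedged illustration of common parallel-corpus filtering heuristics, not the company's actual pipeline. The function names, the `max_length_ratio` threshold, and the sample data are all assumptions. The length check counts whitespace-separated tokens, so it would need a proper word segmenter for languages such as Thai, Chinese, or Japanese.

```python
import unicodedata

def normalize(segment):
    """Decode bytes as UTF-8 if needed, apply NFC, and collapse whitespace."""
    if isinstance(segment, bytes):
        segment = segment.decode("utf-8", errors="replace")
    segment = unicodedata.normalize("NFC", segment)
    return " ".join(segment.split())

def clean_parallel_segments(pairs, max_length_ratio=3.0):
    """Filter aligned (source, target) segment pairs with simple heuristics.

    Drops pairs with an empty side, pairs whose token counts differ by more
    than `max_length_ratio` (a sign of misalignment), and exact duplicates.
    """
    seen = set()
    kept = []
    for source, target in pairs:
        source, target = normalize(source), normalize(target)
        if not source or not target:
            continue  # one side is empty after normalization
        src_len, tgt_len = len(source.split()), len(target.split())
        if max(src_len, tgt_len) / min(src_len, tgt_len) > max_length_ratio:
            continue  # lengths too different; probably misaligned
        if (source, target) in seen:
            continue  # exact duplicate of an earlier pair
        seen.add((source, target))
        kept.append((source, target))
    return kept

# Hypothetical sample data: one good pair, a duplicate with extra
# whitespace, a pair with an empty side, and a badly misaligned pair.
sample = [
    ("Hello  world", "Bonjour le monde"),
    ("Hello world", "Bonjour le monde"),
    ("", "x"),
    ("one", "a b c d e f"),
]
cleaned = clean_parallel_segments(sample)
```

In practice, automatic filters like these are only a first pass. As the text notes, human reviewers validate the alignment afterwards.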
The firm currently has more than 530 language pairs available in a baseline form and is progressively deploying 15 domains across each language pair. Another 200+ language pairs are under development. Currently supported languages are the Asian languages Arabic, Chinese, Hindi, Japanese, Bahasa Indonesian, Bahasa Malay, Korean, and Thai; and the European languages Bulgarian, Czech, Danish, Dutch, English, Estonian, Finnish, French, German, Greek, Hungarian, Irish, Italian, Latvian, Lithuanian, Maltese, Norwegian, Polish, Portuguese, Romanian, Slovak, Slovene, Spanish, Swedish, Russian, and Ukrainian. The additional Asian languages Bengali, Gujarati, Punjabi, Tagalog, Tamil, Urdu, and Vietnamese are under development.
Its systems are currently used to build customized translation systems for corporate and language service provider (LSP) customers, who add their own bilingual parallel corpora to the existing data to create higher-quality translation systems.
The company characterizes its products as a "platform": a suite of tools and products that can work independently or together. Some are installed locally, and some are available only through its software-as-a-service (SaaS) offering. This is described in the CSA blog entry.
The Language Studio product suite was reviewed by Common Sense Advisory, a translation industry market research firm, in its Global Watchtower blog, linked below.
- Asia Online Portal Homepage
- Asia Online Company Homepage
- Language Studio Platform Overview
- Thai Prime Minister Abhisit Vejjajiva and ICT Minister Chuti Krairiksh launch the Thai-language Asia Online portal in front of the Thai media as a gift to the children from the government on Children's Day.
- CSA Global Watchtower Blog entry on Language Studio Platform
- CSA Global Watchtower Blog entry - The Largest Translation Project…So Far
- TAUS Technology Review of Language Studio
- dotSUB: CEO Dion Wiggins on Asia Online Vision from Bangkok Localization & Translation Conference, December 2009
- dotSUB: Renato Beninatto, CEO of Milengo and former industry analyst, looks at the ongoing evolution of the localization business and compares Google Translate with Asia Online.
- GizMag Article on Asia Online