Computer-assisted translation

From Wikipedia, the free encyclopedia

Computer-assisted translation, computer-aided translation, or CAT is a form of translation in which a human translator uses computer software designed to support and facilitate the translation process.

Computer-assisted translation is sometimes called machine-assisted, or machine-aided, translation.

Computer-assisted translation and machine translation

Some advanced computer-assisted translation solutions include controlled machine translation (MT). The technology is widely known among professional translators and terminologists, and it is also available to individual translators who wish to invest in it. Higher-priced MT modules generally give the translator a more complex set of tools, which may include terminology management features and various other linguistic tools and utilities. Carefully customized user dictionaries based on correct terminology significantly improve the accuracy of MT and, as a result, can increase the efficiency of the entire translation process.

Overview

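Computer-assisted translation is a broad and imprecise term covering a range of tools, from the fairly simple to the more complicated. These can include translation memory software, language search engine software, terminology management software and alignment software, each described in the sections that follow.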

Translation memory software

Translation memory (TM) programs store previously translated source texts and their equivalent target texts in a database and retrieve related segments during the translation of new texts.

Such programs split the source text into manageable units known as "segments". A source-text sentence or sentence-like unit (a heading, a title or an element in a list) may be treated as a segment, or texts may be segmented into larger units, such as paragraphs, or smaller ones, such as clauses.

As the translator works through a document, the software displays each source segment in turn and, if it finds a matching source segment in its database, offers the previous translation for re-use. If it finds no match, the program lets the translator enter a translation for the new segment. Once the translation of a segment is complete, the program stores the new translation and moves on to the next segment.

In the dominant paradigm, the translation memory is in principle a simple database whose fields contain the source-language segment, the translation of the segment, and other information such as the segment creation date, last access, translator name, and so on. Another translation memory approach does not involve the creation of a database, relying on aligned reference documents instead (e.g. SDL Trados).
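The segment-and-lookup cycle described above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular product works: the names `TMEntry`, `segment`, `TranslationMemory` and the 0.75 similarity threshold are assumptions made for the example, the segmentation rule is deliberately naive, and similarity is measured with the standard library's `difflib.SequenceMatcher` rather than a commercial fuzzy-match algorithm.

```python
import re
from dataclasses import dataclass, field
from datetime import datetime, timezone
from difflib import SequenceMatcher


@dataclass
class TMEntry:
    """One record of the memory: source segment, translation, and metadata."""
    source: str
    target: str
    translator: str = "unknown"
    created: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


def segment(text):
    """Split text into sentence-like segments (naive rule: ., ! or ? followed by whitespace)."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


class TranslationMemory:
    def __init__(self, threshold=0.75):
        self.entries = []
        self.threshold = threshold  # assumed minimum similarity for offering a match

    def store(self, source, target, translator="unknown"):
        """Save a finished translation so later segments can reuse it."""
        self.entries.append(TMEntry(source, target, translator))

    def lookup(self, source):
        """Return (entry, score) for the most similar stored segment, or None below the threshold."""
        best, best_score = None, 0.0
        for entry in self.entries:
            score = SequenceMatcher(None, source.lower(), entry.source.lower()).ratio()
            if score > best_score:
                best, best_score = entry, score
        if best is not None and best_score >= self.threshold:
            return best, best_score
        return None


tm = TranslationMemory()
tm.store("Press the red button.", "Appuyez sur le bouton rouge.")
for seg in segment("Press the green button. Close the cover."):
    match = tm.lookup(seg)
    # A match is offered for re-use; otherwise the translator enters a new translation.
    print(seg, "->", match[0].target if match else "(no match)")
```

A linear scan over all entries is adequate for a toy memory; a real tool would need an index to keep lookups fast on large databases.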

Some translation memory programs function as standalone environments, while others function as an add-on or macro to commercially available word-processing or other business software programs. Add-on programs allow source documents from other formats, such as desktop publishing files, spreadsheets, or HTML code, to be handled using the TM program.

Language search engine software

New to the translation industry, language search engine software is typically an Internet-based system that works much like an Internet search engine. Rather than searching the Internet, however, a language search engine searches a large repository of translation memories to find previously translated sentence fragments, phrases, whole sentences, or even complete paragraphs that match source document segments.

Language search engines are designed to use modern search technology to search on the source words in context, so that the search results match the meaning of the source segments. As with traditional TM tools, the value of a language search engine rests heavily on the translation memory repository it searches.
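One simple way to picture searching a repository for fragments "in context" is an inverted index over word n-grams: a hit requires several consecutive words to match, not just isolated words. The sketch below is only an illustration under that assumption; the class name `PhraseIndex`, the trigram size and the ranking by shared n-gram count are choices made for the example, not a description of any actual language search engine.

```python
import re
from collections import defaultdict


def ngrams(text, n=3):
    """Return the word n-grams of a text, lower-cased and with punctuation dropped."""
    words = re.findall(r"\w+", text.lower())
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


class PhraseIndex:
    def __init__(self, n=3):
        self.n = n
        self.pairs = []                 # (source, target) pairs from the repository
        self.index = defaultdict(set)   # n-gram -> ids of the pairs that contain it

    def add(self, source, target):
        pair_id = len(self.pairs)
        self.pairs.append((source, target))
        for gram in ngrams(source, self.n):
            self.index[gram].add(pair_id)

    def search(self, query):
        """Return ((source, target), shared n-gram count), best match first."""
        hits = defaultdict(int)
        for gram in set(ngrams(query, self.n)):
            for pair_id in self.index.get(gram, ()):
                hits[pair_id] += 1
        ranked = sorted(hits.items(), key=lambda item: (-item[1], item[0]))
        return [(self.pairs[pair_id], count) for pair_id, count in ranked]


repo = PhraseIndex()
repo.add("Remove the battery cover before cleaning.",
         "Retirez le couvercle de la batterie avant le nettoyage.")
repo.add("Replace the filter every month.", "Remplacez le filtre chaque mois.")
for (source, target), shared in repo.search("Always remove the battery cover first"):
    print(shared, source, "=>", target)
```

The query is not a stored sentence, yet the first pair is still found because it shares the fragment "remove the battery cover".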

Terminology management software

Terminology management software gives the translator a means of automatically searching a given terminology database for terms appearing in a document, either by displaying the terms automatically in the translation memory software interface window or by letting the translator view the entry in the terminology database through hot keys. Some programs have other hot-key combinations that allow the translator to add new terminology pairs to the terminology database on the fly during translation. Some of the more advanced systems enable translators to check, either interactively or in batch mode, whether the correct source/target term combination has been used within and across the translation memory segments in a given project.
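Both functions described above, term lookup for the current segment and a batch consistency check, can be approximated as follows. This is a rough sketch: the termbase is assumed to be a plain dictionary of source term to target term, the function names `find_terms` and `check_terminology` are made up for the example, and matching is by literal string, so inflected forms would be missed.

```python
import re


def find_terms(segment, termbase):
    """Return the termbase entries whose source term occurs in the segment as a whole word or phrase."""
    found = {}
    for source_term, target_term in termbase.items():
        pattern = r"\b" + re.escape(source_term) + r"\b"
        if re.search(pattern, segment, flags=re.IGNORECASE):
            found[source_term] = target_term
    return found


def check_terminology(pairs, termbase):
    """Batch check: report (segment index, source term, expected target term)
    for every segment whose translation lacks the expected target term."""
    issues = []
    for i, (source, target) in enumerate(pairs):
        for source_term, target_term in find_terms(source, termbase).items():
            if target_term.lower() not in target.lower():
                issues.append((i, source_term, target_term))
    return issues


termbase = {"hard disk": "disque dur", "power supply": "bloc d'alimentation"}
pairs = [
    ("Replace the hard disk.", "Remplacez le disque dur."),
    ("Test the power supply.", "Testez l'alimentation."),
]
print(find_terms("Replace the hard disk.", termbase))
print(check_terminology(pairs, termbase))
```

Here the second segment is flagged because its translation does not contain the target term recorded for "power supply".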

Alignment software

Alignment programs take completed translations, divide both source and target texts into segments, and attempt to determine which segments belong together in order to build a translation memory database with the content. Many alignment programs allow translators to manually realign mismatched segments. The resulting translation memory file can then be imported into a translation memory program for future translations.
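To illustrate how a program might "determine which segments belong together", the sketch below aligns two lists of segments purely by character length, using dynamic programming over a few allowed groupings (one-to-one, two-to-one, one-to-two, and unmatched segments). It is a simplified stand-in for length-based alignment in general, not the method of any specific tool; the function name `align` and the penalty values are assumptions chosen for the example, and a real aligner would typically use a statistical length model and let the translator correct the result.

```python
def align(src, tgt, skip_penalty=30, merge_penalty=10):
    """Align two lists of segments by character length.

    Returns a list of (source_segments, target_segments) tuples, where each
    side is a list holding zero, one or two segments.
    """
    # Allowed groupings: (source segments used, target segments used, fixed penalty)
    beads = [(1, 1, 0), (1, 0, skip_penalty), (0, 1, skip_penalty),
             (2, 1, merge_penalty), (1, 2, merge_penalty)]
    n, m = len(src), len(tgt)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0
    for i in range(n + 1):
        for j in range(m + 1):
            if cost[i][j] == inf:
                continue
            for di, dj, penalty in beads:
                ni, nj = i + di, j + dj
                if ni > n or nj > m:
                    continue
                src_len = sum(len(s) for s in src[i:ni])
                tgt_len = sum(len(t) for t in tgt[j:nj])
                # Groupings with very different lengths on the two sides cost more.
                candidate = cost[i][j] + abs(src_len - tgt_len) + penalty
                if candidate < cost[ni][nj]:
                    cost[ni][nj] = candidate
                    back[ni][nj] = (i, j)
    # Follow the back-pointers from the end to recover the cheapest alignment.
    result = []
    i, j = n, m
    while (i, j) != (0, 0):
        pi, pj = back[i][j]
        result.append((src[pi:i], tgt[pj:j]))
        i, j = pi, pj
    result.reverse()
    return result


source = ["Switch off the radio.", "Wait ten seconds.", "Switch it on again."]
target = ["Schalten Sie das Radio aus und warten Sie zehn Sekunden.",
          "Schalten Sie es wieder ein."]
for src_group, tgt_group in align(source, target):
    print(src_group, "<->", tgt_group)
```

In this example the first two source sentences are grouped with the single, longer target sentence, which is the kind of mismatch a purely positional pairing would get wrong.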

Comparison of different CAT tools

(Alphabetical order, free software first, proprietary solutions second.)

Tool | Supported file formats | OS | Price | License
Free software
Anaphraseus | ODT, all OpenOffice Writer formats (DOC, TXT etc.) | Cross-platform (StarBasic macro) | | GPL
Attesoro | Java properties | Cross-platform (Java) | | GPL
gtranslator | PO | POSIX | | GPL
Okapi Framework | PO, Windows RC, TMX, Wordfast, Trados, Java Properties, Regular-expression-based text, Illustrator, INX, ResX, Table-type files, XML | Cross-platform (Java) | | GPL
OmegaT+ | DocBook, DokuWiki, JavaHelp, Java Properties, OpenDocument (ODF), OpenOffice, Office Open XML, HTML, Help And Manual, HTML Help Compiler (HCC), INI, Mozilla DTD, PO, ResX, StarOffice, Text, Typo3, Windows RC, WiX, XHTML, XLIFF | Cross-platform (Java) | | GPL
OmegaT | HTML, XHTML, DocBook, Plain Text, PO, JavaHelp, Java Resource Bundles, OpenDocument (ODF), OpenOffice, StarOffice, Office Open XML, HTML Help Compiler (HCC), INI files | Cross-platform (Java) | | GPL
BEYTrans | HTML, XHTML, Plain Text, PO, PHP Array, Java Resource Bundles, HTML Help Compiler (HCC) | Cross-platform (Java) | | GPL - Free open crowdsourcing translation; Online non-commercial translation
openTMS | HTML, XHTML, DocBook, Plain Text, OpenOffice, Office Open XML | Cross-platform (Java) | | EPL
Open Language Tools | HTML/XHTML, XML, DocBook SGML, ASCII, StarOffice/OpenOffice/ODF, .po (gettext), .properties, .java (ResourceBundle), .msg/.tmsg (catgets) | Cross-platform (Java) | | CDDL
Poedit | Gettext PO | Cross-platform | | MIT license
Pootle | Gettext PO, XLIFF, OpenOffice GSI files (.sdf), TMX, TBX, Java Properties, DTD, CSV, HTML, XHTML, Plain Text | Cross-platform (Python) | | GPL
Transolution | HTML, StarOffice/OpenOffice, XLIFF, DocBook | Cross-platform (Python) | | GPL
Virtaal | XLIFF, Gettext PO and MO, TMX, TBX, Wordfast TM, Qt .ts. Many others via converters in the Translate Toolkit | Cross-platform (Python) | | GPL
Proprietary solutions
ABBYY Aligner | DOC, DOCX, TXT, RTF, PDF, HTML, PPT, PPTX, PPS, PPSX, XLS, XLSX | Windows | Contact for pricing | Proprietary
Alchemy CATALYST | 9x, NT, 2000, XP, Win32, Winx64, VISTA, RC, RESX, .NET Binaries (1.x and 2.0), Visual Basic.NET, Microsoft WPF, Java EE, Java SE, Java ME, JAR, .properties, WAS, EAR, HTML (and all derivatives PHP, ASP, JSP), XHTML, XML (including derivative ASP.NET, ASP, JSP and XSL), MS Excel, DITA 1.0, Databases | Windows | eStore | Proprietary
AppleTrans | HTML, RTF, Plain Text (TXT) | Macintosh, Mac program | Free | Proprietary
across | MS Word, MS Excel, MS PowerPoint, HTML, XML, RTF, Plain Text (TXT), EXE, RC, DLL, QuarkXPress, Adobe FrameMaker (MIF), Adobe InDesign, MSI, INI, OCX, SCR, CPL, NLS, RESX | Windows, web application | Freelance version: free, corporate licenses on inquiry | Proprietary
AidTransStudio | OpenOffice, MS Excel, MS PowerPoint, MS Word, MS Word XML, HTML, ASP, PHP, ASPX, Plain Text, XML, Trados TTX, TMX, Custom Format (config based on Regular Expressions) | Windows (.NET) | Basic Edition: free; Pro and Ent: see price list | Proprietary
AnyMem 2.0 | MS Word, TMX export-import | Windows | 89 | Proprietary
Araya | HTML, XML, plain text, RTF, TMX, XLIFF | Cross-platform (Java) | €400 / Server €6500 | Proprietary
CafeTran | Plain Text, HTML, XML, OpenOffice, MS Office, AbiWord, Kword, TMX, Trados TTX, XLIFF, Adobe InDesign (INX and IDML), Adobe FrameMaker (MIF), AutoCad (DXF), iWork, Java Properties, Windows .NET Resources (ResX), Mac OSX/iOS String Resources | Cross-platform Windows - Mac OS - Linux (Java) | €80 | Proprietary
CatsCradle | HTML, CSV, Help contents and index files (.hhc, .hhk) | Windows | €60 | ?
Déjà Vu (DVX) | XML, Plain Text, OpenOffice, Adobe FrameMaker, Adobe PageMaker, ASP, Interleaf/Quicksilver, InDesign, Help Content, SGML, MS Access, MS Excel, MS PowerPoint, MS Word, QuarkXPress, RTF, Resource files, C/C++/Java source files, Java Properties, JavaScript, VBScript, GNU gettext | Windows | Standard: €490, Pro: €990, Workgroup: €1490 | Proprietary
Felix | MS Word, MS Excel, MS PowerPoint (for Windows); HTML | Windows | $US 350 | Proprietary
Fluency | MS Word (.doc, .docx), MS Excel (.xls, .xlsx), MS PowerPoint (.pptx, .ppsx, .ppt, .pps), TMX, Adobe PDF, Adobe InDesign, HTML, XHTML, XML, RTF, Trados Bilingual Documents, Trados TTX, Plain Text | Windows | See price list | Proprietary
Fusion | MS Word, MS Excel, MS PowerPoint (for Windows); RTF, XML, HTML. Note: native Office support, meaning translation is carried out within the Office environments (Word, Excel, PowerPoint); also Trados and Wordfast tagged formats | Windows | | Proprietary
GlobalSight | Text ANSI / ASCII / Unicode for Windows, Text for Apple Macintosh, HTML, XML (ASP.NET, ASP, JSP, XSL), SGML, SVG (Scalable Vector Graphics), MS Word for Windows, MS Excel, MS PowerPoint, RTF, RC, QuarkXPress, Adobe FrameMaker, Adobe PageMaker, Interleaf/Quicksilver, Adobe InDesign | Cross-platform (Java) | | Apache License 2.0
Google Translator Toolkit | HTML, Microsoft Word, OpenDocument Text, Text, Rich Text, SubRip, SubViewer, Wikipedia | Web application | Free | Proprietary
Heartsome Translation Suite | HTML/XHTML, XML, Plain Text, OpenOffice, StarOffice, AbiWord, PO/POT (GNU Gettext), SVG, Adobe FrameMaker (MIF), Adobe InDesign, DocBook, DITA, Java Properties, JavaScript, RTF, Tagged RTF, Trados TTX, MS Office 2003 XML, ResX (Windows .NET Resources), RC (Windows C/C++ Resources), MS Office 2007 (beta) | Cross-platform (Java) | See price list | Proprietary
Lingotek Translation Platform | HTML, XHTML, XML, MS Word, MS Excel, MS PowerPoint, OpenOffice/StarOffice, OpenDocument (ODF), OpenDocument Text (.odt), OpenDocument Spreadsheet (.ods), OpenDocument Presentation (.odp), Microsoft Resource (.rc), Rich Text Format (.rtf), Plain Text (.txt), Java Properties (.properties), Gettext PO, TMX, XLIFF | Web application | Contact for pricing | Proprietary
Lingo | HTML, XML, Adobe FrameMaker, Adobe PDF (.pdf), .net, MS Word (.doc), MS Ecma (.xps), TMX (.tmx) | Windows 2000 SP4, XP SP2, Vista; .NET Framework 2.0, Internet Explorer 5.5, SQL Server 2005 | US $ 899 | Proprietary
LocaleMinder.com | Java Properties (.properties), XML, TMX, XLIFF, W3C ITS | Web application | Based on number of words for developers. Free for translators. | Proprietary/Apache License 2.0 interaction library
LogiTermWeb | Word, Excel, PowerPoint, WordPerfect, PDF, HTML, RTF, plain text | Windows | | Proprietary
memoQ | HTML, XML - with real-time preview, plain text, MS Office 2000/XP/2003/2007 formats (doc, xls, ppt, docx, pptx - all with real-time preview), RTF, bilingual RTF (Trados compatible), RTF multi-column table (both export and import), full XLIFF support (both creation and import/export), Star Transit XV and NXT packages, Trados TTX, Adobe FrameMaker, Adobe InDesign, proprietary bilingual format (MBD), TMX, CSV, TSV, SRX, PDF (exported as plain text). Native support for many XML-based formats (AuthorIT, DITA, Freemind mindmaps, Excel 2003 XML, Java .properties, Microsoft Help files (.hhk, .hhc), ResX, Typo3). | Windows | 4Free: Freeware; translator standard: €99; translator pro: €620; serverFive: ask | Proprietary
MEMOrg | HTML, XHTML, XML, MS Word, MS Excel, RTF | Windows, Web Application | Access by invitation only | Proprietary
MetaTexis | HTML, XML, Resource files; MS Word (all kinds of text files that can be imported by MS Word), MS Excel, MS PowerPoint, Adobe FrameMaker, Adobe PageMaker, QuarkXPress | Windows | Lite: for free; Pro: €98; NET/Office: €138 | Proprietary
MultiCorpora MultiTrans | HTML, XML, MS Word, MS Excel, MS PowerPoint, WordPerfect, QuarkXPress, Adobe Maker Interchange Format (MIF), Adobe InDesign | Windows | | Proprietary
Fortis Revolution Translation Suite | TMX, HTML, XML, Word, FrameMaker, PageMaker, InDesign, Resource files, Help files, text files, source code, and more. Customized filters are available upon request. | Windows | $US 199 |
ppp.helper | MS PowerPoint | Windows | €39 | Proprietary
Rainbow | HTML, XHTML, Scripts, Photoshop, etc. | Windows (.NET) | Freeware | Proprietary
SDL Trados | Features four translation environments: dedicated TagEditor, MS Word interface, SDLX, and, as the latest development, the new SDL Trados Studio 2009. Additional filters for translating with TagEditor available: Word, Excel, PowerPoint, OpenOffice, InDesign, QuarkXPress, PageMaker, Interleaf, FrameMaker, HTML, SGML, XML, SVG, ... Includes SDL MultiTerm for terminology management and Project Management Dashboard for automating tasks and tracking. | Windows | €795 (freelance) - €4995 (LSP; floating license) | Proprietary
SecureCottage Reference Portal | HTML, XHTML, PHP, CGI, most web pages | Web based | Free for use | Proprietary
SEER English Spanish Translator | MS Office Word, Excel, PowerPoint, Plain Text, HTML | Windows | $US 299 | Proprietary
Similis | HTML, XML, PDF, MS Word, OpenOffice, Trados | Windows | Rental - €360 per year | Proprietary
STAR Transit | Text ANSI / ASCII / Unicode for Windows, Text for Apple Macintosh, Corel WordPerfect, HTML, XML (ASP.NET, ASP, JSP, XSL), SGML, SVG (Scalable Vector Graphics), MS Word for Windows, MS Excel, MS PowerPoint, RTF and RTF for WinHelp, RC, QuarkXPress, Adobe FrameMaker, Adobe PageMaker, Interleaf/Quicksilver, Adobe InDesign, XGate for QuarkXPress, AutoCAD | Windows | | Proprietary
Swordfish | XLIFF, HTML/XHTML, XML, Plain Text, OpenOffice, StarOffice, AbiWord, PO/POT (GNU Gettext), SVG, Adobe FrameMaker (MIF), Adobe InDesign (INX and IDML), DocBook, DITA, Java Properties, JavaScript, RTF, Trados Tagged RTF, Trados TTX, MS Office 2003 XML, ResX (Windows .NET Resources), RC (Windows C/C++ Resources), MS Office 2007, MS Visio, SDLXLIFF (Trados 2009) | Cross-platform Windows - Linux - Mac OS X (Java) | €240 | Proprietary
Sysfilter (ECM engineering) | Transfers texts from files to a word processor of the user's choice or into XML; after translation, the filter automatically reinserts the texts into the original document. Supported formats: Adobe Illustrator, InDesign, Photoshop, MS Visio, MS Excel, CorelDraw | Windows | €69 - €349 | Proprietary
Termbases | ? | Web application | Free | Proprietary
Tr-aid | ? | ? | ? | ?
TranslateCAD | Extracts text from AutoCAD DXF drawings (Version 2000-2008), prepares files to work with any CAT tool, and re-joins files to generate a translated target DXF file. | Windows | $US 29.95 | Proprietary
TransSearch | Any text copied on the web form | Web application | $CA 129.95 | Proprietary
Translation Search Engine (by Elanex) | Any format that can be copied and pasted onto a web page, TMX | Web application | Free Web Access | Proprietary
Web Translate It | Gettext .po/.pot, Ruby .yml (Ruby on Rails), XLIFF, Java .properties/.xml, Qt .ts, JSON, Apple .strings (Macintosh, iPhone and iPad), Android .xml, BlackBerry .rrc, HTML, XHTML, Microsoft .resx, plain text, Markdown, Textile, PHP .ini | Web application | 0 - €149 monthly | Proprietary
Wordfast Classic | MS Word, Excel, PowerPoint (for Windows and Mac); tagged documents | Microsoft Office Word add-in | €330/165 (for both Wordfast Classic and Wordfast Pro bundle) | Proprietary
Wordfast Pro | MS Word, Excel, PowerPoint, HTML, XML, ASP, JSP, InDesign | Cross-platform Windows, Mac, Linux (Java) | €330/165 (for both Wordfast Classic and Wordfast Pro bundle) | Proprietary
WordFisher | MS Word | WordBasic/MS Office Word macro | Free Licence | Proprietary
XTM Enterprise Suite | XLIFF, HTML/XHTML, XML, Plain Text, OpenOffice, StarOffice, PO/POT (GNU Gettext), SVG, Adobe FrameMaker (MIF), Adobe InDesign (INX), DocBook, DITA, Java Properties, JavaScript, RTF, Trados Tagged RTF, Trados TTX, MS Office 2003 XML, ResX (Windows .NET Resources), RC (Windows C/C++ Resources), MS Office 2007; built completely on Open Standards: TMX, TBX, XLIFF, SRX, xml:tm, W3C ITS | Web application | |
TM-database | Adobe InDesign (*.inx), Plain Text (*.txt), PHP source (*.txt), Excel (*.xls), TMX (*.tmx) | Windows | Free | Proprietary

A 2006 survey by Imperial College London ranked the translation memory systems most popular at the time.[1]

CAT tools have changed considerably since the survey was published, so its results should be interpreted with care.[citation needed]

As of July 2010, videos of more than 20 translation environment tools translating the same Word document are available free of charge on TranslatorsTraining.com.

References

  1. See page 26 in "Imperial College London Translation Memories Survey" (PDF). 2006. http://www3.imperial.ac.uk/portal/pls/portallive/docs/1/7307707.PDF