Philipp Koehn

From Wikipedia, the free encyclopedia
Philipp Koehn
Born: August 1, 1971 (age 43), Erlangen, Bavaria, Germany
Residence: United Kingdom
Citizenship: Germany
Fields: computer science, natural language processing, machine translation
Institutions: University of Edinburgh, Johns Hopkins University
Alma mater: University of Erlangen-Nuremberg, University of Tennessee, University of Southern California
Doctoral advisor: Kevin Knight
Known for: Asia Online, Europarl Corpus, Moses
Notable awards: Finalist, 2013 EPO European Inventor Award

Philipp Koehn (born August 1, 1971 in Erlangen, Germany) is a computer scientist and researcher in the field of machine translation.[1][2] His primary research interest is statistical machine translation, and he is one of the inventors of phrase-based machine translation, a sub-field of statistical machine translation that uses sequences of words (so-called "phrases") as the basic unit of translation, extending the earlier word-based approaches. A 2003 paper that he co-authored with Franz Josef Och and Daniel Marcu, Statistical Phrase-Based Translation, attracted wide attention in the machine translation community and has been cited over a thousand times.[3] Phrase-based methods are widely used in machine translation applications in industry; examples of such systems include Google Translate and Asia Online.
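The idea of translating with phrases rather than single words can be illustrated with a minimal sketch. It is not the method from the 2003 paper or the Moses implementation: the phrase table, its probabilities, and the function name translate_monotone are all invented for illustration, and the sketch leaves out the language model and reordering that real phrase-based systems rely on. It only shows how a sentence can be split into phrases so that the product of phrase translation probabilities is as high as possible.

```python
import math

# Toy phrase table: source phrase -> list of (target phrase, probability).
# Every entry and probability here is invented for illustration only.
PHRASE_TABLE = {
    ("das",): [("the", 0.7), ("that", 0.3)],
    ("haus",): [("house", 0.9)],
    ("ist",): [("is", 0.9)],
    ("klein",): [("small", 0.8), ("little", 0.2)],
    ("das", "haus"): [("the house", 0.8)],
    ("ist", "klein"): [("is small", 0.6)],
}


def translate_monotone(source_words, phrase_table, max_phrase_len=3):
    """Return (translation, log_prob) for the best left-to-right segmentation.

    Hypothetical helper: it scores a segmentation by the sum of the log
    probabilities of its phrase pairs and never reorders phrases.
    """
    n = len(source_words)
    # best[i] holds (log_prob, target_phrases) for the first i source words,
    # or None if no sequence of known phrases covers them.
    best = [None] * (n + 1)
    best[0] = (0.0, [])
    for end in range(1, n + 1):
        for start in range(max(0, end - max_phrase_len), end):
            if best[start] is None:
                continue
            src = tuple(source_words[start:end])
            for tgt, prob in phrase_table.get(src, []):
                score = best[start][0] + math.log(prob)
                if best[end] is None or score > best[end][0]:
                    best[end] = (score, best[start][1] + [tgt])
    if best[n] is None:
        raise ValueError("no segmentation covers the whole input")
    log_prob, phrases = best[n]
    return " ".join(phrases), log_prob


translation, log_prob = translate_monotone("das haus ist klein".split(), PHRASE_TABLE)
print(translation)  # the house is small
```

In this toy table the two-word phrase "das haus" scores higher than translating "das" and "haus" separately, while "ist" and "klein" score higher when translated word by word, so the best segmentation mixes both granularities.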

Koehn is married to Trishann Koehn and has two children, Phianna and Leo.

Moses Statistical Machine Translation Decoder

The Moses machine translation decoder is an open-source project that was created by, and is maintained under the guidance of, Philipp Koehn.[4] Moses is a platform for developing statistical machine translation systems for any language pair, given a parallel corpus.[5] The decoder was developed mainly by Hieu Hoang and Philipp Koehn at the University of Edinburgh, extended during a Johns Hopkins University Summer Workshop, and further developed under EuroMatrix and GALE project funding. The decoder, which is part of a complete statistical machine translation toolkit, is the de facto benchmark for research in the field.

Koehn continues to play a major role in the development of Moses. The decoder has been supported by the European Framework 6 projects EuroMatrix and TC-Star; the European Framework 7 projects EuroMatrixPlus, Let’s MT, META-NET and MosesCore; and the DARPA GALE project, as well as by several universities such as the University of Edinburgh, the University of Maryland, ITC-irst, Massachusetts Institute of Technology, and others. Substantial additional contributors to the Moses decoder include Hieu Hoang, Chris Dyer, Josh Schroeder, Marcello Federico, Richard Zens, and Wade Shen.

Europarl Corpus

The Europarl Corpus is a set of documents consisting of the proceedings of the European Parliament from 1996 to the present. The corpus has been compiled and expanded by a group of researchers led by Philipp Koehn at the University of Edinburgh. The data that makes up the corpus was extracted from the website of the European Parliament and then prepared for linguistic research. The latest release (2012) comprised up to 60 million words per language,[6] with 21 European languages represented: Romance (French, Italian, Spanish, Portuguese, Romanian), Germanic (English, Dutch, German, Danish, Swedish), Slavic (Bulgarian, Czech, Polish, Slovak, Slovene), Finno-Ugric (Finnish, Hungarian, Estonian), Baltic (Latvian, Lithuanian), and Greek.

Other Interests and Activities In Chronological Order

  • Koehn is a professor and Chair of Machine Translation at the University of Edinburgh's School of Informatics and contributes to its Statistical Machine Translation Group, which organizes workshops, seminars and projects related to the subject.[7]
  • Koehn has been a consultant to SYSTRAN since 2006.[8] SYSTRAN is a public company, founded in 1968, that specializes in machine translation software.[9] In 2011 it had a market capitalization of 15.0 million euros, and its software publishing business accounted for revenues of 6.1 million euros.[10]
  • Koehn also serves as Chief Scientist for Asia Online[11] and has been a shareholder in the company since 2007. Asia Online is a private company that develops and commercializes machine translation technologies;[12] its valuation and revenues are undisclosed.
  • Koehn authored a book titled Statistical Machine Translation in 2009.[13][14][15]

Awards and Recognition

  • 2013: One of three finalists in the category of Research for the European Patent Office (EPO) 2013 European Inventor Award.[16] Koehn was recognized for patent EP 1488338 B, Phrase-Based Joint Probability Model for Statistical Machine Translations, a translation model that uses mathematical probabilities to determine the most likely interpretation of chunks of text between foreign languages.

References