From Wikipedia, the free encyclopedia

ELMo ("Embeddings from Language Model") is a word embedding method for representing a sequence of words as a corresponding sequence of vectors.[1] Character-level tokens are taken as the inputs to a bi-directional LSTM which produces word-level embeddings. Like BERT (but unlike the word embeddings produced by "Bag of Words" approaches, and earlier vector approaches such as Word2Vec and GloVe), ELMo embeddings are context-sensitive, producing different representations for words that share the same spelling but have different meanings (homonyms) such as "bank" in "river bank" and "bank balance".

It was created by researchers at the Allen Institute for Artificial Intelligence[2] and University of Washington.


  1. ^ Peters ME, Neumann M, Iyyer M, Gardner M, Clark C, Lee K, Zettlemoyer L (2018). "Deep contextualized word representations". arXiv:1802.05365 [cs.CL].
  2. ^ "AllenNLP - ELMo — Allen Institute for AI".