
Lancaster-Oslo-Bergen Corpus

From Wikipedia, the free encyclopedia

Latest revision as of 20:05, 14 June 2024

The Lancaster-Oslo/Bergen (LOB) Corpus is a one-million-word collection of British English texts compiled in the 1970s through a collaboration between the University of Lancaster, the University of Oslo, and the Norwegian Computing Centre for the Humanities, Bergen. It was designed as a British counterpart to the Brown Corpus, which Henry Kučera and W. Nelson Francis compiled for American English in the 1960s.

Its composition was designed to match the original Brown Corpus as closely as possible in size and genres, using documents published in the UK in 1961 by British authors.[1] Both corpora consist of 500 samples, each comprising about 2,000 words, in the following genres:

Label  Text category                             Brown Corpus  LOB Corpus
A      Press: reportage                          44            44
B      Press: editorial                          27            27
C      Press: reviews                            17            17
D      Religion                                  17            17
E      Skills, trades and hobbies                36            38
F      Popular lore                              48            44
G      Belles lettres, biography, essays         75            77
H      Miscellaneous (documents, reports, etc.)  30            30
J      Learned and scientific writings           80            80
K      General fiction                           29            29
L      Mystery and detective fiction             24            24
M      Science fiction                           6             6
N      Adventure and western fiction             29            29
P      Romance and love story                    29            29
R      Humour                                    9             9
Total                                            500           500
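
The composition table can be checked mechanically. The following is a minimal Python sketch, not part of the corpus distribution: the names `SAMPLES`, `WORDS_PER_SAMPLE`, `totals` and `differences` are hypothetical and introduced only for illustration. It encodes the sample counts from the table, confirms that each corpus totals 500 samples, and lists the genres in which the LOB Corpus departs from the Brown Corpus.

```python
# Sample counts per genre label, copied from the table above: (Brown, LOB).
SAMPLES = {
    "A": (44, 44),
    "B": (27, 27),
    "C": (17, 17),
    "D": (17, 17),
    "E": (36, 38),
    "F": (48, 44),
    "G": (75, 77),
    "H": (30, 30),
    "J": (80, 80),
    "K": (29, 29),
    "L": (24, 24),
    "M": (6, 6),
    "N": (29, 29),
    "P": (29, 29),
    "R": (9, 9),
}

# Approximate sample length stated in the text (an approximation, not exact).
WORDS_PER_SAMPLE = 2000


def totals(samples):
    """Return the total number of samples as (Brown, LOB)."""
    brown = sum(b for b, _ in samples.values())
    lob = sum(l for _, l in samples.values())
    return brown, lob


def differences(samples):
    """Return {label: LOB count minus Brown count} for genres that differ."""
    return {
        label: lob - brown
        for label, (brown, lob) in samples.items()
        if lob != brown
    }


if __name__ == "__main__":
    brown_total, lob_total = totals(SAMPLES)
    print("Brown samples:", brown_total)
    print("LOB samples:", lob_total)
    print("Approximate LOB size in words:", lob_total * WORDS_PER_SAMPLE)
    print("Genres that differ (LOB minus Brown):", differences(SAMPLES))
```

On the figures in the table, the two corpora differ only in categories E, F and G, and 500 samples of about 2,000 words each give the stated size of roughly one million words.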

The corpus has also been tagged, i.e. a part-of-speech category has been assigned to every word.[2]

References

  1. LOB Corpus Manual.
  2. Johansson, Stig. "CoRD | The Lancaster-Oslo/Bergen Corpus (LOB)". varieng.helsinki.fi. https://varieng.helsinki.fi/CoRD/corpora/LOB/index.html