= Menotec =

Menotec was an infrastructure project funded by the Norwegian Research Council (2010–2012) with the aim of transcribing and annotating a corpus of Old Norwegian texts.

== Description ==
The transcribed texts have been, and will continue to be, published in the Medieval Nordic Text Archive, while the annotated texts have been published in the treebank of the PROIEL project and are also accessible through the INESS portal. Funding for the project ran for three years (2010–2012), but the project work continues in new contexts.

As a first step, transcriptions were made of eight central Old Norwegian law manuscripts containing approx. 480,000 words. The transcriptions were made by Anna C. Horn, and afterwards proofread by several colleagues at the University of Oslo:

- Holm perg 34 4to (hand f, ca. 1276–1300), partly by the scribe Eiríkr Þróndarson (hand f), often regarded as the codex optimus of Landslǫg Magnúss Hákonarsonar (The Law-Code of Magnús Hákonarson the Lawmender)

- AM 78 4to (ca. 1276–1300)

- AM 302 fol (ca. 1300), by the scribe Þorgeirr Hákonarson

- AM 305 fol (hand a, ca. 1300), by the scribe Þorgeirr Hákonarson (hand a)

- AM 56 4to (ca. 1300), by the scribe Þorgeirr Hákonarson

- Holm perg 30 4to (hand b, ca. 1300–1325)

- AM 60 4to (hand b, ca. 1320), used as the base codex for Landslǫg Magnúss Hákonarsonar in the edition of Norges Gamle Love, vol. 2

- Upps DG 8 I (hands a and b, ca. 1300–1350)

All but AM 78 4to include Landslǫg Magnúss Hákonarsonar (concluded in 1276), and most of them also contain several other law texts.

As a second step, a full linguistic annotation was made of four major Old Norwegian manuscripts:

- The Old Norwegian Homily Book, Gammelnorsk homiliebok, in AM 619 4to (ca. 1200–1225)

- The legendary saga of St Olaf, Óláfs saga ins helga, in Upps DG 8 II (ca. 1225–1250)

- Strengleikar in Upps DG 4–7 4to (ca. 1270)

- Landslǫg Magnúss Hákonarsonar in the above-mentioned ms. Holm perg 34 4to (ca. 1276–1300)

These four mss. have been annotated morphologically (adding the lemma and the grammatical form of each word) as well as syntactically. The syntactic annotation is based on dependency analysis as developed in the PROIEL project.
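The two annotation layers described above can be pictured with a minimal sketch. It assumes a simplified token record: the class `Token`, its field names, the relation labels, and the sample sentence "konungr gaf hring" ("the king gave a ring") are all illustrative and are not taken from the Menotec corpus or from the actual PROIEL schema. Each word carries a lemma and a grammatical form (the morphological layer) and points to its head word with a labelled relation (the dependency layer).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Token:
    """One annotated word (hypothetical, simplified record)."""
    id: int
    form: str             # word form as it appears in the text
    lemma: str            # dictionary form (morphological layer)
    morphology: str       # grammatical form, written out in plain words here
    head: Optional[int]   # id of the head word; None marks the root
    relation: str         # illustrative dependency label


# Invented example sentence, not a quotation from any of the manuscripts.
sentence = [
    Token(1, "konungr", "konungr", "noun, nominative singular", 2, "subject"),
    Token(2, "gaf", "gefa", "verb, 3rd person singular past", None, "predicate"),
    Token(3, "hring", "hringr", "noun, accusative singular", 2, "object"),
]


def root(tokens: List[Token]) -> Token:
    """Return the single token that has no head."""
    roots = [t for t in tokens if t.head is None]
    if len(roots) != 1:
        raise ValueError("expected exactly one root token")
    return roots[0]


def dependents(tokens: List[Token], head_id: int) -> List[Token]:
    """Return the tokens directly governed by the token with id head_id."""
    return [t for t in tokens if t.head == head_id]


if __name__ == "__main__":
    verb = root(sentence)
    print(f"{verb.form} (lemma: {verb.lemma}; {verb.morphology})")
    for dep in dependents(sentence, verb.id):
        print(f"  {dep.relation}: {dep.form} (lemma: {dep.lemma}; {dep.morphology})")
```

In a dependency analysis of this kind the finite verb is the root of the sentence, and the other words attach to it, or to each other, through labelled relations rather than through phrase-structure nodes.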

The corpus of these four mss. comprises approx. 200,000 words, and in the coming years it will be extended by at least 50,000 words. Menotec is the first project offering a syntactic annotation of Old Norwegian. On the PROIEL site, the Old Norwegian texts will join a central Old Icelandic work, the Poetic Edda in GKS 2365 4to (a manuscript often referred to as Codex Regius). The Eddic poems have been annotated along the same lines as the texts in Menotec. Several other Early Germanic and Romance texts will also be found on the PROIEL site.

The Menotec project was led by Christian-Emil Ore at the University of Oslo and included several other participants at this university: Karl G. Johansson, Anna C. Horn, Signe Laake, Kari Kinn, Dag Haug and Hanne Eckhoff. The University of Bergen was a partner in the project, and at this university Odd Einar Haugen led the work of linguistic annotation. Fartein Th. Øverland, also from Bergen, participated in the project, as did Haraldur Bernharðsson and Eirikur Kristjánsson from Iceland. The name Menotec is not an acronym but is derived from the project name Menota (which in turn is an acronym for the Medieval Nordic Text Archive).

Guidelines for the annotation have been published by Haugen and Øverland and are available in parallel versions in Norwegian and English, Retningslinjer and Guidelines. These guidelines explain the conventions of the morphological and syntactic annotation and also give a pragmatic introduction to dependency analysis for Old Norwegian, covering a wide range of annotation problems which arose during the project.

Further information on the encoding and annotation of Medieval Nordic sources can be found on the site of the Medieval Nordic Text Archive. Since the texts have been transcribed on a diplomatic level, some special characters are needed for the display of the texts. The encoding of these characters follows the recommendations of the Medieval Unicode Font Initiative, and several suitable fonts can be downloaded free of charge from the website of this project.

== See also ==
- Text corpus
- Dependency grammar
- Part-of-speech tagging
