COCOA (digital humanities)

COCOA was an early piece of text analysis software and associated file format for digital humanities, then known as humanities computing. It consisted of approximately 4000 punched cards of FORTRAN and was created in the late 1960s and early 1970s at University College London and the Atlas Computer Laboratory. Its functionality included word counting and concordance building.[1][2][3][4]

Oxford Concordance Program

The Oxford Concordance Program (OCP) format was a direct descendant of COCOA, developed at Oxford University. The Oxford Text Archive holds items in this format.[5]

Later developments

The COCOA file format bears at least a passing similarity to later markup languages such as SGML and XML. A noticeable difference from its successors is that COCOA tags are flat rather than tree-structured. In COCOA, each tag sets a value for one type of information, and that value remains in effect until a later tag of the same type changes it. Members of the Text Encoding Initiative community maintain legacy support for COCOA,[6][7] although most in-demand texts and corpora have already been migrated to more widely understood formats such as TEI XML.[8]

Example

The play The Medea of Euripides, in Ancient Greek, is encoded in a 1976 Oxford Text Archive edition[9] as:

<      THE MEDEA OF EURIPIDES      >
<P MHD.>
<S TR>
<V 0001>
EIQ' WFEL' $ARGOUV MH DIAPTASQAI SKAFOV
<V 0002>
$KOLCWN EV AIAN KUANEAV $SUMPLHGADAV,
<V 0003>
MHD' EN NAPAISI $PHLIOU PESEIN POTE
<V 0004>
TMHQEISA PEUKH, MHD' ERETMWSAI CERAV
<V 0005>
ANDRWN ARISTWN, O*I TO PAGCRUSON DERAV
<V 0006>
$SPELIA: METHLQON. OU GAR AN DESPOIN' EMH
...
<      END OF THE MEDEA      >

Note the XML-like tags carrying metadata at the start, the numbered 'V' tags before each line of text, and the creative use of whitespace on the first and last lines. The actual text is a pre-Unicode transliteration into the Roman alphabet; compare it with the Greek source on Wikisource.
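
The rule that a tag's value holds until the same tag changes it can be illustrated with a short Python sketch. This is not part of COCOA or of any existing converter; the function name annotate, the regular expression, and the decision to skip banner lines such as the first line of the excerpt are all assumptions made for illustration, based only on the few lines shown above.

```python
import re
from typing import Dict, Iterable, Iterator, Tuple

# Assumed tag shape: "<NAME value>", e.g. "<V 0001>". Lines such as
# "<      THE MEDEA OF EURIPIDES      >" (whitespace right after "<")
# do not match and are treated as banners to skip.
TAG = re.compile(r"^<(\S+)\s+([^>]*?)\s*>$")


def annotate(lines: Iterable[str]) -> Iterator[Tuple[Dict[str, str], str]]:
    """Yield (current tag values, text line) pairs from COCOA-style input."""
    state: Dict[str, str] = {}  # flat: one current value per tag name
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        match = TAG.match(line)
        if match:
            # A tag overwrites only its own entry; other tags stay in force.
            state[match.group(1)] = match.group(2)
        elif line.startswith("<"):
            continue  # banner line, ignored in this sketch
        else:
            yield dict(state), line  # copy, so later tags do not alter it


SAMPLE = """\
<      THE MEDEA OF EURIPIDES      >
<P MHD.>
<S TR>
<V 0001>
EIQ' WFEL' $ARGOUV MH DIAPTASQAI SKAFOV
<V 0002>
$KOLCWN EV AIAN KUANEAV $SUMPLHGADAV,
"""

rows = list(annotate(SAMPLE.splitlines()))
for tags, text in rows:
    print(tags, text)
```

Both text lines carry the same P and S values, because only the V tag was restated between them. There is no closing tag and no nesting, which is the sense in which the format is flat rather than tree-structured.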

References

  1. Paul E. Corcoran (November 1974). "COCOA: A FORTRAN Program for Concordance and Word-count Processing of Natural Language Texts". Behavior Research Methods & Instrumentation (Springer-Verlag) 6 (6): 566. doi:10.3758/BF03201351.
  2. Colin Day and Ian Marriott (February 1976). "Software Reviews: COCOA: A Word Count and Concordance Generator". Computers and the Humanities (Kluwer Academic Publishers) 10 (1): 56. doi:10.1007/BF02399143.
  3. D. B. Russell (1965). "COCOA - A Word Count and Concordance Generator". Associates Technology Literature Applications Society. Retrieved 20 October 2013.
  4. Susan Hockey. "The History of Humanities Computing". University of Illinois. Archived from the original on 18 September 2013. Retrieved 20 October 2013.
  5. "Concordia discordantium canonum ac primum de iure naturae et constitutionis". University of Oxford Text Archive. Retrieved 20 October 2013.
  6. James Cummings, Sebastian Rahtz (2010). "This script is used to convert COCOA to TEI" (XSL). Oxford University. Retrieved 20 October 2013.
  7. https://github.com/TEIC/Stylesheets/tree/master/cocoa
  8. http://www.helsinki.fi/varieng/CoRD/corpora/HelsinkiCorpus/HC_XML.html
  9. http://ota.ox.ac.uk/desc/2414