Transcription (linguistics)

Transcription in the linguistic sense is the systematic representation of language in written form. The source can either be utterances (speech or sign language) or preexisting text in another writing system.

Transcription should not be confused with translation, which means representing the meaning of a source-language text in a target language (e.g. translating the meaning of an English text into Spanish), or with transliteration, which means representing a text from one script in another (e.g. transliterating a Cyrillic text into the Latin script).

In the academic discipline of linguistics, transcription is an essential part of the methodologies of (among others) phonetics, conversation analysis, dialectology and sociolinguistics. It also plays an important role in several subfields of speech technology. Common examples of transcription outside academia are the proceedings of a court hearing such as a criminal trial (produced by a court reporter) or a physician's recorded voice notes (medical transcription). This article focuses on transcription in linguistics.

Phonetic vs. orthographic transcription

Broadly speaking, there are two possible approaches to linguistic transcription. Phonetic transcription focuses on phonetic and phonological properties of spoken language. Systems for phonetic transcription thus furnish rules for mapping individual sounds or phones to written symbols, and they operate with specially defined character sets, usually the International Phonetic Alphabet. Systems for orthographic transcription, by contrast, consist of rules for mapping spoken words onto written forms as prescribed by the orthography of a given language.

Which type of transcription is chosen depends mostly on the research interests pursued. Since phonetic transcription strictly foregrounds the phonetic nature of language, it is most useful for phonetic or phonological analyses. Orthographic transcription, on the other hand, has a morphological and a lexical component alongside the phonetic component (which aspect is represented to which degree depends on the language and orthography in question). It is thus more convenient wherever meaning-related aspects of spoken language are investigated. Phonetic transcription is undoubtedly more systematic in a scientific sense, but it is also harder to learn, more time-consuming to carry out and less widely applicable than orthographic transcription.
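The difference between the two approaches can be illustrated with a small sketch. The mini-lexicon below is hypothetical and chosen only for illustration; the IPA forms are broad renderings of common English pronunciations. It shows that words with the same phonetic transcription may have different orthographic transcriptions (because orthography carries lexical information), and that one orthographic form may correspond to several phonetic ones.

```python
from collections import defaultdict

# Hypothetical mini-lexicon: (orthographic form, gloss, broad IPA form).
TOKENS = [
    ("night", "time of darkness", "naɪt"),
    ("knight", "armoured soldier", "naɪt"),
    ("read", "present tense", "riːd"),
    ("read", "past tense", "rɛd"),
]


def group_by(tokens, key_index, value_index):
    """Collect, for each key field, the set of values it co-occurs with."""
    groups = defaultdict(set)
    for token in tokens:
        groups[token[key_index]].add(token[value_index])
    return dict(groups)


# Phonetic form -> orthographic forms, and the reverse direction.
spellings_per_sound = group_by(TOKENS, 2, 0)
sounds_per_spelling = group_by(TOKENS, 0, 2)

# One phonetic form, two orthographic forms.
print(sorted(spellings_per_sound["naɪt"]))
# One orthographic form, two phonetic forms.
print(sorted(sounds_per_spelling["read"]))
```

A transcriber interested in pronunciation would need the IPA column to tell the two instances of "read" apart, whereas one interested in meaning would need the spelling to tell "night" from "knight".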

Transcription as theory

Mapping spoken language onto written symbols is not as straightforward a process as it may seem at first glance. Written language is an idealisation, made up of a limited set of clearly distinct and discrete symbols. Spoken language, on the other hand, is a continuous (as opposed to discrete) phenomenon, made up of a potentially unlimited number of components. There is no predetermined system for distinguishing and classifying these components and, consequently, no preset way of mapping these components onto written symbols.

The literature is relatively consistent in pointing out that transcription practices are not neutral. There is not, and cannot be, a neutral transcription system. Knowledge of social culture enters directly into the making of a transcript and is captured in the texture of the transcript (Baker, 2005).

Transcription systems

Transcription systems are sets of rules which define how spoken language is to be represented in written symbols. Most phonetic transcription systems are based on the International Phonetic Alphabet or, especially in speech technology, on its derivative SAMPA. Examples of orthographic transcription systems (all from the field of conversation analysis or related fields) are:

CA (Conversation Analysis)

Arguably the first system of its kind, originally sketched in Sacks et al. (1978) and later adapted for use in computer-readable corpora as CA-CHAT by MacWhinney (2000). The field of Conversation Analysis itself includes a number of distinct approaches to transcription and sets of transcription conventions. These include, among others, Jefferson Notation. To analyze conversation, recorded data is typically transcribed into a written form that analysts can work with. There are two common approaches. The first, called narrow transcription, captures the details of conversational interaction, such as which particular words are stressed, which words are spoken with increased loudness, the points at which turns-at-talk overlap, and how particular words are articulated. If such detail is less important, perhaps because the analyst is more concerned with the overall structure of the conversation or the relative distribution of turns-at-talk amongst the participants, then a second type of transcription known as broad transcription may be sufficient (Williamson, 2009).
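The relationship between narrow and broad transcription can be sketched in code. The example below is only an illustration under stated assumptions: it handles a small, simplified subset of Jefferson-style symbols (timed pauses such as (0.5), micropauses (.), overlap brackets, latching =, sound-lengthening colons, capitals for loudness, and a few pitch and tempo marks), and the function name to_broad is hypothetical. It is not a full implementation of any published convention.

```python
import re

# Timed pauses such as "(0.5)" and micropauses "(.)".
PAUSES = re.compile(r"\(\d+(?:\.\d+)?\)|\(\.\)")
# Overlap brackets, latching, lengthening, quiet speech, pitch and tempo marks.
SYMBOLS = re.compile(r"[\[\]=:°↑↓<>]")


def to_broad(narrow: str) -> str:
    """Reduce a narrowly transcribed utterance to a plain word sequence.

    Assumes the input is a single utterance without a speaker label
    (a label such as "A:" would lose its colon).
    """
    text = PAUSES.sub(" ", narrow)
    text = SYMBOLS.sub("", text)
    # Capitals mark loudness in the narrow form; lowercasing discards that,
    # at the cost of also flattening ordinary capitals such as "I".
    text = text.lower()
    # Collapse the whitespace left behind by removed pauses.
    return " ".join(text.split())


print(to_broad("[WELL] I (0.5) rea::lly don't (.) know="))
```

The sketch makes the trade-off visible: everything the function removes (pause length, overlap, loudness, lengthening) is exactly the interactional detail that a narrow transcript exists to preserve.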

Jefferson Notation

The Jefferson Notation System is a set of symbols, developed by Gail Jefferson, which is used for transcribing talk. Jefferson had gained some experience in transcribing when she was hired in 1963 as a clerk typist at the UCLA Department of Public Health to transcribe sensitivity-training sessions for prison guards. She went on to transcribe some of the recordings that served as the materials out of which Harvey Sacks’ earliest lectures were developed. Over four decades, for the majority of which she held no university position and was unsalaried, Jefferson’s research into talk-in-interaction set the standard for what became known as Conversation Analysis (CA). Her work has greatly influenced the sociological study of interaction as well as disciplines beyond it, especially linguistics, communication, and anthropology.^[1] This system is employed universally by those working from the CA perspective and is regarded as having become a near-globalized set of instructions for transcription.^[2]

DT (Discourse Transcription)

A system described in DuBois et al. (1992), used for the transcription of the Santa Barbara Corpus of Spoken American English (SBCSAE) and later developed further into DT2.

GAT (Gesprächsanalytisches Transkriptionssystem – Conversation Analytic transcription system)

A system described in Selting et al. (1998), later developed further into GAT2 (Selting et al. 2009), and widely used in German-speaking countries for prosodically oriented conversation analysis and interactional linguistics.

HIAT (Halbinterpretative Arbeitstranskriptionen – Semiinterpretative Working Transcriptions)

Arguably the first system of its kind, originally described in Ehlich and Rehbein (1976); see Ehlich (1992) for an English reference. It was adapted for use in computer-readable corpora (Rehbein et al. 2004) and is widely used in functional pragmatics.
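On the phonetic side, the relationship between SAMPA and the International Phonetic Alphabet mentioned at the start of this section can be sketched as a symbol-for-symbol mapping, since SAMPA represents IPA symbols with plain ASCII characters. The table below is a small subset of correspondences commonly used for English, not the full inventory, and the function name sampa_to_ipa is hypothetical. Every symbol in this subset is a single ASCII character; characters outside the table (such as p, t, k) are identical in both notations here and are passed through unchanged.

```python
# Partial SAMPA-to-IPA table (subset used for English; not exhaustive).
SAMPA_TO_IPA = {
    "S": "ʃ", "Z": "ʒ", "T": "θ", "D": "ð", "N": "ŋ",   # consonants
    "I": "ɪ", "E": "ɛ", "{": "æ", "V": "ʌ", "U": "ʊ",   # short vowels
    "@": "ə", "3": "ɜ",                                 # central vowels
    ":": "ː",                                           # length mark
    '"': "ˈ",                                           # primary stress
}


def sampa_to_ipa(sampa: str) -> str:
    """Convert a SAMPA string to IPA, one character at a time."""
    return "".join(SAMPA_TO_IPA.get(char, char) for char in sampa)


for word in ["TIN", "SIp", "k{t", "tS3:tS"]:
    print(word, "->", sampa_to_ipa(word))
```

A per-character lookup is sufficient for this subset; a fuller converter would need to handle multi-character symbols and language-specific variants.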

Transcription software

Transcription was originally a process carried out manually, i.e. with pencil and paper, using an analogue sound recording stored on, e.g., a Compact Cassette. Nowadays, most transcription is done on computers. Recordings are usually digital audio or video files, and transcriptions are electronic documents. Specialized computer software exists to assist the transcriber in efficiently creating a digital transcription from a digital recording. Among the most widely used transcription tools in linguistic research are:

ANVIL (Annotation of Video and Language Data): A tool specialising in transcription of multimodal interaction, see the ANVIL website.
Cielo24: A tool to create captions, indexes and transcripts for searchable metadata.
CLAN (Computerized Language Analysis): A tool mainly used for the transcription of child language acquisition data as in the CHILDES database, see the CLAN page of the CHILDES website.
ELAN (EUDICO Linguistic Annotator): A tool widely used for the transcription of sign language and the documentation of endangered languages, see the ELAN page on the Language Archiving Technology portal.
EXMARaLDA (Extensible Markup Language for Discourse Annotation): A tool widely used in discourse analysis, dialectology and sociolinguistics, see the EXMARaLDA website.
f4transkript: A tool used in social science, whose site includes a free guide on transcription methodology, see the f4transkript website.
FOLKER (FOLK Editor): A tool developed for the Research and Teaching Corpus of Spoken German (FOLK) and widely used in conversation analysis, see the FOLKER page at the website of the Institute for German Language.
Praat: A tool widely used in phonetics.
Transcriber: A tool originally developed for the transcription of speech, see the Transcriber website at SourceForge.
Voxcribe: A tool for media (audio/video) transcription and captioning with embedded speech recognition for English.

Other transcription software is developed for commercial sale.

References

[1] Obituary – gail-jefferson.com
[2] Davidson, C. (2007). Independent writing in current approaches to writing instruction: What have we overlooked? English Teaching: Practice and Critique, Volume 6, Number 1. http://edlinked.soe.waikato.ac.nz/research/files/etpc/files/2007v6n1art1.pdf
