Metrical phonology is a theory of stress or linguistic prominence. The innovative feature of this theory is that the prominence of a unit is defined relative to other units in the same phrase. For example, in the most common pronunciation of the phrase "doctors use penicillin" (if said out-of-the-blue), the syllable '-ci-' is the strongest or most stressed syllable in the phrase, but the syllable 'doc-' is more stressed than the syllable '-tors'. Previously, generative phonologists and the American Structuralists represented prosodic prominence as a feature that applied to individual phonemes (segments) or syllables. This feature could take on multiple values to indicate various levels of stress. Stress was assigned using the cyclic reapplication of rules to words and phrases.
Metrical phonology holds that stress is separate from pitch accent and has phonetic effects on the realization of syllables beyond their intonation, including effects on their duration and amplitude. The perceived stress of a syllable results from its position in the metrical tree and metrical grid for the phrase it appears in.
Linguistic prominence in metrical phonology is partially determined by the relations between nodes in a branching tree, in which one node is Strong (S) and the other node or nodes are Weak (W). The labels Strong and Weak have no inherent phonetic realization, and only have meaning relative to the rest of the labels in the tree. A Strong node is stronger than its Weak sister node. The most prominent syllable in a phrase is the one that does not have any Weak nodes above it. This syllable is called the Designated Terminal Element. In the example tree (1), the syllable '-ci-' is the Designated Terminal Element.
Metrical trees allow us to change the stress pattern for a phrase by switching S and W sister nodes. The tree in (1) represents the metrical structure for the sentence "Doctors use penicillin" when the sentence is providing all new information. This is called broad focus, and might be used in response to a question like "What did you learn at the hospital today?" The same metrical structure would be used when the sentence has narrow focus on the word 'penicillin', for example in response to a question like "What do doctors use to treat that disease?" However, a different metrical structure is needed to put narrow focus on the word 'doctors', for example when the phrase answers the question "Who uses penicillin?" In this case, the S and W nodes at the intermediate phrase (ip) level have to switch, resulting in (2).
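The definition of the Designated Terminal Element and the effect of switching sister labels can be sketched in a few lines of Python. The tree below is a simplified, assumed stand-in for (1): it omits the names of the prosodic levels, and the `Node` class and the `dte` and `swap_children_labels` helpers are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    label: str                       # "S", "W", or "R" for the unlabeled root
    syllable: Optional[str] = None   # filled in only on terminal nodes
    children: List["Node"] = field(default_factory=list)


def dte(node: Node) -> Optional[str]:
    """Follow the S-labeled child at every branching node down to a syllable."""
    while node.children:
        node = next(child for child in node.children if child.label == "S")
    return node.syllable


def swap_children_labels(node: Node) -> None:
    """Exchange S and W on the immediate children of a node (in place)."""
    for child in node.children:
        child.label = "W" if child.label == "S" else "S"


# Assumed, simplified tree for "doctors use penicillin" (not the exact tree in (1)).
doctors = Node("W", children=[Node("S", "doc"), Node("W", "tors")])
penicillin = Node("S", children=[
    Node("W", children=[Node("S", "pe"), Node("W", "ni")]),
    Node("S", children=[Node("S", "ci"), Node("W", "llin")]),
])
use_penicillin = Node("S", children=[Node("W", "use"), penicillin])
phrase = Node("R", children=[doctors, use_penicillin])

broad_focus = dte(phrase)       # most prominent syllable under broad focus
swap_children_labels(phrase)    # switch the top-level S and W sisters
narrow_focus = dte(phrase)      # most prominent syllable with focus on 'doctors'
```

In this sketch only the labels of one pair of sisters change. The rest of the tree is untouched, which is why the prominence relation between 'doc-' and '-tors' is the same in both versions.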
In metrical phonology there has been debate about whether nodes on metrical trees must have exactly two children, making them binary branching, or whether they can have any number of children, making them n-ary branching. Proponents of binary branching trees have claimed that such trees can constrain the restructuring of very long and very short constituents because new nodes created in this restructuring have to correspond to nodes in the original tree. Proponents of n-ary branching trees point out that only multiple branches allow a limited number of tree levels, which can correspond to predetermined levels of prosodic constituents, whereas binary branching trees require intermediate levels that do not correspond to any prosodic constituent.
A number of levels of prosodic constituents have been proposed, including: moras, syllables, feet, phonological words, clitic groups, phonological phrases, intermediate phrases, intonational phrases, and phonological utterances. The relations between prosodic constituents at different levels are commonly thought to be governed by the Strict Layer Hypothesis (SLH). This hypothesis states that in metrical trees, all prosodic constituents at a particular level consist exclusively of constituents from the level below. The SLH forbids a number of types of tree structures, including trees in which: a node has two parents in the level above, a node has two or more different types of children, a node has children from a level that is not the level immediately below it, a node does not correspond to any of the specified levels, or a node has children of the same type as itself.
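As a rough illustration, the Strict Layer Hypothesis can be read as a well-formedness check on trees. The sketch below assumes the nine levels listed above in exactly that order, and represents a node as a hypothetical `(level, children)` pair. A nested representation like this cannot express a node with two parents, so that configuration is excluded by construction rather than checked.

```python
# Levels from smallest to largest, in the order listed in the text (an assumption
# of this sketch; analyses differ on which levels they recognize).
LEVELS = [
    "mora", "syllable", "foot", "phonological word", "clitic group",
    "phonological phrase", "intermediate phrase", "intonational phrase",
    "phonological utterance",
]
RANK = {name: i for i, name in enumerate(LEVELS)}


def slh_violations(node):
    """Return a list of messages, one per Strict Layer Hypothesis violation."""
    level, children = node
    if level not in RANK:
        return [f"unknown level: {level}"]
    problems = []
    for child in children:
        child_level = child[0]
        # Every child must belong to the level immediately below its parent.
        if RANK.get(child_level) != RANK[level] - 1:
            problems.append(f"{level} directly dominates {child_level}")
        problems.extend(slh_violations(child))
    return problems


well_formed = ("phonological word", [
    ("foot", [("syllable", []), ("syllable", [])]),
])
ill_formed = ("phonological word", [
    ("syllable", []),            # skips the foot level
    ("foot", [("foot", [])]),    # a foot containing a foot
])

print(slh_violations(well_formed))
print(slh_violations(ill_formed))
```

The single rank comparison covers several of the forbidden configurations at once: skipped levels, children of the same type as the parent, and mixed child types all fail the "immediately below" test.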
The various levels of the prosodic hierarchy are independently justified by the phonological phenomena that make reference to them. For instance, in English the sounds /p/, /t/, and /k/ are aspirated (followed by a puff of air) only if they are the first segment in a foot. Similarly, in Tuscan Italian, the intonational phrase is the domain of the gorgia toscana, a rule that changes voiceless plosives (/p/, /t/, /k/) between vowels into fricative consonants, like /θ/ (th) and /h/.
In addition to describing prominence relations between words, metrical trees can also describe prominence relations within words. Indeed, a set of rules developed by Liberman and Prince can be used to predict stress in English words quite accurately. Their Lexical Category Prominence Rule states that the second node in a pair of sister nodes is labeled W unless one of a number of conditions is met, such as the node branching or dominating a particular suffix, in which case it is labeled S. Allowable tree structures and node labels for a particular word in Liberman and Prince's system are constrained by the two-value feature [± stress], which can be assigned to segments or syllables by separate rules that refer to the number and type of segments in the syllable and the syllable's position in the word. Syllables that are [- stress] can only be immediately dominated by a W node. However, syllables that are [+ stress] can be immediately dominated by S or W nodes.
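The labeling step can be sketched as follows, under strong simplifying assumptions: the only condition modeled is branching (the suffix-based conditions and the [± stress] constraints are left out), the input is an already bracketed binary tree, and the bracketing of 'penicillin' is assumed for illustration. The function name `label` is hypothetical.

```python
def label(tree, own_label="R"):
    """Label a binary-bracketed word: the second sister is S only if it branches.

    `tree` is either a syllable (str) or a pair (first, second) of subtrees.
    Returns (label, syllable) for terminals and (label, [left, right]) otherwise.
    """
    if isinstance(tree, str):
        return (own_label, tree)
    first, second = tree
    second_branches = not isinstance(second, str)
    second_label = "S" if second_branches else "W"
    first_label = "W" if second_branches else "S"
    return (own_label, [label(first, first_label), label(second, second_label)])


# Assumed bracketing: (pe ni)(ci llin)
labeled = label((("pe", "ni"), ("ci", "llin")))
print(labeled)
```

With this bracketing the second constituent branches and is labeled S, while inside each pair the non-branching second syllable is labeled W, so '-ci-' ends up with no W node above it.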
In a metrical grid, all the words in the phrase are arranged along the bottom and the rows of the grid indicate different levels of prominence, as in (3).
(3) Example metrical grid
The higher the column of Xs above a syllable, the more prominent the syllable is. The metrical grid and the metrical tree for a particular utterance are related in such a way that the Designated Terminal Element of an S node must be more prominent than the Designated Terminal Element of its sister W node. So in (3), the metrical grid for the utterance in (1), '-ci-' must be more prominent than 'doc-' because '-ci-' is the Designated Terminal Element of the highest S node and 'doc-' is the Designated Terminal Element of its sister W node.
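The relation between tree and grid stated above can be written as a check. In this sketch a tree node is a hypothetical `(label, content)` pair, where `content` is either a syllable or a list of child nodes, and the grid is reduced to a mapping from syllable to column height. Both the tree and the heights are assumed values for illustration, not the actual figures (1) and (3).

```python
def dte(tree):
    """Designated Terminal Element: follow S children down to a syllable."""
    label, content = tree
    if isinstance(content, str):
        return content
    return dte(next(child for child in content if child[0] == "S"))


def grid_matches_tree(tree, heights):
    """True if, for every set of sisters, the DTE of the S node has a
    strictly higher column than the DTE of each W sister."""
    label, content = tree
    if isinstance(content, str):
        return True
    strong = next(child for child in content if child[0] == "S")
    for child in content:
        if child is not strong and heights[dte(strong)] <= heights[dte(child)]:
            return False
    return all(grid_matches_tree(child, heights) for child in content)


# Assumed, simplified tree for "doctors use penicillin".
tree = ("R", [
    ("W", [("S", "doc"), ("W", "tors")]),
    ("S", [
        ("W", "use"),
        ("S", [
            ("W", [("S", "pe"), ("W", "ni")]),
            ("S", [("S", "ci"), ("W", "llin")]),
        ]),
    ]),
])

# Hypothetical column heights (number of Xs above each syllable).
heights = {"doc": 3, "tors": 1, "use": 2, "pe": 2, "ni": 1, "ci": 4, "llin": 1}
print(grid_matches_tree(tree, heights))
```

The check constrains only relative heights, so many different grids are compatible with one tree.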
The structure of the metrical grid explains a number of otherwise surprising features of prominence patterns in language. For example, the main stress in English phrases may be placed several syllables away from the end of the phrase, even though the rule assigning this stress looks for a lexically stressed syllable near this boundary. Using a metrical grid, this rule can simply apply to the rightmost element in the highest row of the grid. Therefore, what seemed to be a non-local application of the phrasal stress rule is reinterpreted as the local application of the rule to the highest row of the metrical grid.
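A minimal sketch of this reinterpretation: the rule inspects only the highest row of the grid and promotes its rightmost element, without counting syllables from the phrase boundary. The column heights below are hypothetical values for "doctors use penicillin" before phrasal stress is assigned, and `apply_phrasal_stress` is an illustrative name.

```python
def apply_phrasal_stress(heights):
    """Add one X to the rightmost column that reaches the highest row.

    Returns the index of the promoted column and the new list of heights.
    """
    top = max(heights)
    rightmost = max(i for i, h in enumerate(heights) if h == top)
    new_heights = list(heights)
    new_heights[rightmost] += 1
    return rightmost, new_heights


syllables = ["doc", "tors", "use", "pe", "ni", "ci", "llin"]
before = [3, 1, 3, 2, 1, 3, 1]   # hypothetical heights, one main stress per word

index, after = apply_phrasal_stress(before)
print(syllables[index], after)
```

In the highest row the candidates are 'doc-', 'use', and '-ci-', and '-ci-' is the last of them, even though another syllable follows it in the phrase.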
Metrical grids were originally developed to handle a phenomenon that appears in some languages, including English, German, and Masoretic Hebrew, in which stress shifts to avoid a 'stress clash'. A stress clash can occur when two stressed syllables are too close to each other. For example, the word 'nineteen' spoken in isolation has stress on the second syllable. But when it is placed before 'girls' the stress on 'nineteen' can shift to the first syllable. Two syllables exhibit stress clash if there are two successive rows in the grid in which their columns are adjacent (i.e. there is no X between them). For example, in grid (4) the columns for 'teen' and 'girls' are adjacent in both the first and second rows, indicating a stress clash.
(4) Pre-stress-shift metrical grid
Stress clashes can be resolved by the Rhythm Rule, which reverses the S-W relation for some pair of sister nodes, as long as such a reversal does not put a Designated Terminal Element of an Intonational Phrase under any W node, and doesn't put a [- stress] syllable directly under an S node. In (4) the W and S nodes over 'nine-' and '-teen' can be reversed, leading to the non-clashing grid in (5).
(5) Post-stress-shift metrical grid
This process is optional, and seems to be applied more often in some circumstances than others.
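The clash definition above can be tested mechanically. The sketch below represents a grid by its column heights and assumes hypothetical heights for 'nineteen girls' before and after the shift, since the actual grids (4) and (5) are not reproduced here. It models only the detection of a clash, not the conditions on the Rhythm Rule itself.

```python
def adjacent_in_row(heights, i, j, row):
    """True if columns i and j both reach `row` and no column between them does."""
    if heights[i] < row or heights[j] < row:
        return False
    return all(heights[k] < row for k in range(i + 1, j))


def stress_clashes(heights):
    """Return index pairs whose columns are adjacent in two successive rows."""
    top = max(heights)
    found = []
    for i in range(len(heights)):
        for j in range(i + 1, len(heights)):
            if any(
                adjacent_in_row(heights, i, j, row)
                and adjacent_in_row(heights, i, j, row + 1)
                for row in range(1, top)
            ):
                found.append((i, j))
    return found


syllables = ["nine", "teen", "girls"]
pre_shift = [1, 2, 3]    # hypothetical heights: stress on 'teen'
post_shift = [2, 1, 3]   # hypothetical heights: stress moved to 'nine'

print(stress_clashes(pre_shift))
print(stress_clashes(post_shift))
```

After the shift, 'nine' and 'girls' are adjacent in the second row, but 'teen' stands between them in the first row, so there is no pair of successive rows in which they are adjacent.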
- Right-dominant vs. left-dominant: In a right-dominant language nodes on the right are labeled S, while in a left-dominant language nodes on the left are labeled S.
- Bounded vs. unbounded: In a bounded language the main stress appears a fixed distance from the word boundary and the secondary stress appears at fixed intervals from other stressed syllables. In an unbounded language the main stress is drawn to 'heavy' syllables (syllables with long vowels and/or consonants at the end of the syllable). Within bounded languages, two more parameters apply: left-to-right vs. right-to-left and quantity sensitive vs. insensitive.
- Left-to-right vs. right-to-left: In a left-to-right language metrical trees are constructed starting at the left edge of the word, while in a right-to-left language, they start at the right edge of the word.
- Quantity-sensitive vs. quantity-insensitive: In a quantity-sensitive language a W node cannot dominate a heavy syllable, while in a quantity-insensitive language tree construction is not influenced by the internal makeup of the syllables.
Hayes (1995) describes metrical parameters that can be used to analyse or predict word-level stress placement:
- Quantity-sensitive vs. quantity-insensitive: whether stress is sensitive to syllable weight.
- Foot type: iambs or trochees.
- Parsing directionality: whether feet are built from the left edge of the word to the right, or from right to left.
- Main stress: whether the main stress falls towards the right or the left edge of the word.
- Extrametricality: whether a unit, such as a final consonant, mora, syllable, or foot, is consistently ignored for stress assignment.
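How such parameters interact can be illustrated with a toy stress assigner. This is a sketch under stated assumptions, not Hayes's actual algorithm: it is quantity-insensitive, builds only binary feet, leaves a leftover syllable unfooted, and allows only a final syllable to be extrametrical. The function name, parameter names, and the 0/1/2 encoding (unstressed, secondary, main) are all hypothetical.

```python
def assign_stress(n_syllables, foot_type="trochee", direction="left-to-right",
                  main_stress="left", extrametrical_final=False):
    """Return one stress level per syllable: 0 none, 1 secondary, 2 main."""
    stress = [0] * n_syllables
    # An extrametrical final syllable is invisible to foot construction.
    parsable = n_syllables - 1 if extrametrical_final else n_syllables
    n_feet = max(parsable, 0) // 2

    # Starting index of each binary foot, depending on parsing direction.
    if direction == "left-to-right":
        starts = [2 * k for k in range(n_feet)]
    else:
        starts = sorted(parsable - 2 * (k + 1) for k in range(n_feet))

    # Trochees are left-headed (strong-weak); iambs are right-headed (weak-strong).
    heads = [s if foot_type == "trochee" else s + 1 for s in starts]
    for head in heads:
        stress[head] = 1
    if heads:
        stress[heads[0] if main_stress == "left" else heads[-1]] = 2
    return stress


print(assign_stress(5))
print(assign_stress(5, direction="right-to-left", main_stress="right"))
print(assign_stress(5, direction="right-to-left", main_stress="right",
                    extrametrical_final=True))
print(assign_stress(4, foot_type="iamb"))
```

Changing one setting at a time shows how a small set of parameters yields distinct patterns: for example, right-to-left trochees with main stress on the right give penultimate main stress, and adding final-syllable extrametricality moves it to the antepenult.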
Hierarchical patterns of prominence like those represented in metrical trees can also apply to rhythm in music. The prominence level of a note is determined by the relative prominence of all the nodes above it. The timing of notes also depends on the metrical tree for a particular tune. Each node at the bottom level of the tree (terminal nodes) receives a beat. Empty terminal nodes correspond to rests or form part of a note that spans several beats. Syncopation in music can result when relatively strong nodes are empty.
Metrical phonology offers a number of advantages over a system representing stress as a feature that applies to individual segments or syllables, without reference to the other syllables in a phrase. Creators of traditional feature systems posited the stress feature, which differed from other phonological features in several key ways. For instance, the feature stress had an arbitrary number of values or levels, rather than two or some justified number more than two. In addition, the non-primary stress values in these systems were only defined relative to the primary stress value, and did not have local acoustic or articulatory effects. By not treating stress as a feature of an individual segment, metrical phonology avoids the inexplicable differences between the stress feature and other phonological features.
Metrical phonology also correctly predicts the ambiguity between broad and narrow focus. There are two possible metrical patterns for two-word phrases: S-W and W-S. However, there are three possible patterns of focus for such phrases: narrow focus on the first word, narrow focus on the second word, and broad focus. For instance, the phrase "Gus skied" can either be pronounced "GUS skied" (S-W) or "Gus SKIED" (W-S). These two realizations are the only options for answering the three questions: Who skied? (narrow focus on 'Gus'), What did Gus do? (narrow focus on 'skied'), and What happened yesterday? (broad focus).
Finally, metrical phonology is consistent with patterns of deaccenting in which accents can shift both left and right. This is because swapping S and W nodes will cause stress to move left if the S node was originally on the right, and move right if it was originally on the left. Such bi-directional movement is more difficult to predict under a stress-shift rule, which would specify the direction of movement.
- Liberman, Mark (1975). "The intonational system of English". PhD thesis, MIT, Distributed 1978 by IULC.
- Liberman, Mark; Prince, Alan (1977). "On stress and linguistic rhythm". Linguistic Inquiry 8: 249–336.
- Chomsky, Noam; Halle, Morris (1968). "The sound pattern of English". Harper and Row: New York.
- Nespor, Marina; Vogel, Irene (1982). "Prosodic domains of external sandhi rules". H. van der Hulst and N. Smith (eds.), The Structure of Phonological Representations. Part I. Foris Publications: Dordrecht: 225–255.
- Nespor, Marina; Vogel, Irene (1986). "Prosodic phonology". Foris Publications: Dordrecht.
- Beckman, Mary (1986). "Stress and Non-Stress Accent". Foris Publications: Dordrecht.
- Selkirk, Elizabeth (1984). "Phonology and Syntax: The relation between sound and structure". MIT Press: Cambridge, MA.
- Halle, Morris (1973). "Stress rules in English: A new version". Linguistic Inquiry 4: 451–464.
- Hayes, Bruce (1995). "Metrical stress theory: Principles and case studies". The University of Chicago Press: Chicago.
- Hayes, Bruce (1981). "A metrical theory of stress rules". PhD thesis, MIT, Distributed by Indiana University Linguistics Club.
- Martin, James (1972). "Rhythmic (hierarchical) versus serial structure in speech and other behavior". Psychological Review 79(6): 487–509.
- Ladd, D. Robert (1996). "Intonational Phonology". Cambridge University Press: Cambridge, UK.
- Ladd, D. Robert (1980). "The structure of intonational meaning: Evidence from English". Indiana University Press: Bloomington.