Swedish has a large vowel inventory, with nine vowels distinguished in quality and to some degree quantity, making 17 vowel phonemes in most dialects. Swedish pronunciation of most consonants is similar to that of other Germanic languages. Another notable feature is the pitch accent, which is unusual for European languages.
There are 18 consonant phonemes of which /ɧ/ and /r/ show considerable variation depending on both social and dialectal context. The voiceless palatal-velar fricative realization of /ɧ/ found in many dialects, including forms of the standard language, has so far not been found in any other language.
Contrary to the situation with Danish or Finnish, there is no uniform nationwide spoken Standard Swedish. Instead there are several regional standard varieties (acrolects or prestige dialects), i.e. the most intelligible or prestigious forms of spoken Swedish, each within its area. Within Sweden, actors, singers and TV personalities are often advised to "neutralize" their dialects by assimilating to Central (Svealand) Swedish pronunciation.
Swedish has 9 vowels that, as with many other Germanic languages, come in long and short pairs. The length covaries with the quality of the vowels, as shown below, with short variants being more centered and lax. Traditionally, length has been viewed as the primary distinction, with quality being secondary. No short vowels appear in open stressed syllables. The front vowels appear in rounded-unrounded pairs.
/ɛː/, /ɛ/ (in stressed syllables), /øː/ (with a few exceptions), and /œ/ are lowered to [æː], [æ], [œ̞ː] and [œ̞], respectively, when preceding /r/. In some varieties, traditionally those spoken around Gothenburg and in Östergötland but today also more widely, e.g. in Stockholm and especially among younger speakers, [œ̞] is used in other contexts as well. Words like fördömande ('judging') and fördummande ('dumbing') are then often pronounced similarly, if not identically. The use of [æ] for /ɛː/ and /ɛ/ was formerly a rural, dialectal feature of e.g. Östergötland, but through dialect change in recent decades it has become common in eastern dialects around Stockholm and among younger speakers.
Unstressed /ɛ/ is realized as [ə], i.e. a basic schwa, as in begå ('to commit'), /bɛˈɡoː/ → [bəˈɡoː]. This feature is common to most varieties of Swedish.
In many central and eastern areas (including Stockholm), the contrast between short /ɛ/ and /e/ is lost, except before /r/, where the slight vowel distinction between herre ('master') and märr ('mare') is kept. The loss of this contrast means that hetta ('heat') and hätta ('cap') are pronounced the same. Long /ɑː/ is pronounced with a small amount of lip-rounding.
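The lowering of these four vowels before /r/ can be illustrated with a minimal Python sketch. The list-of-segments representation and the function name `lower_before_r` are assumptions made for illustration only, and the sketch deliberately ignores the stress condition on /ɛ/ and the lexical exceptions for /øː/ mentioned above.

```python
# Lowering map taken from the text: phoneme -> allophone before /r/.
# "\u031e" is the IPA "lowered" diacritic (combining down tack).
LOWERED_BEFORE_R = {
    "ɛː": "æː",
    "ɛ": "æ",
    "øː": "œ\u031eː",
    "œ": "œ\u031e",
}


def lower_before_r(segments):
    """Return a copy of `segments` with the four vowels lowered before /r/.

    `segments` is a hypothetical pre-segmented transcription, e.g.
    ["f", "ɛ", "r", "s", "k"]. Stress and exceptions are not modelled.
    """
    out = list(segments)
    for i in range(len(out) - 1):
        if out[i + 1] == "r" and out[i] in LOWERED_BEFORE_R:
            out[i] = LOWERED_BEFORE_R[out[i]]
    return out


print(lower_before_r(["f", "ɛ", "r", "s", "k"]))  # ['f', 'æ', 'r', 's', 'k']
```

The output still contains /r/ and plain /s/; the retroflexion that yields the attested [fæʂːk] is a separate process.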
In a number of dialects of Swedish, /ʉː/ is a central vowel. However, in Central Standard Swedish it is more front. The primary difference between the two high front rounded vowels /ʉː/ and /yː/ is that /ʉː/ is articulated with compressed lips, [ɪᵝ], while /yː/ uses protruded lips, [iʷ]. /uː/ is also compressed, [ɯᵝ].
There is some variation in the interpretations of vowel length's phonemicity. Elert (1964), for example, treats vowel quantity as its own separate phoneme (a "prosodeme") so that long and short vowels are allophones of a single vowel phoneme.
Diphthongization of long vowels follows distinct patterns in three major dialect groups. In Central Swedish, the high vowels /iː/, /yː/, /ʉː/, and /uː/ are realized as [ij], [yɥ], [ʉβ], and [uw], with a glide that can be accompanied by slight frication. Furthermore, /eː/, /øː/ and /oː/ are often realized as centering diphthongs [eə], [øə] and [oə]. In Southern Swedish dialects, particularly in Scania, the diphthongs are preceded by a rising of the tongue from a central position so that /ʉː/ and /ɑː/ are realized as [eʉ] and [aɑ] respectively, i.e. rising diphthongs. A third type of distinctive diphthong occurs in the dialects of Gotland. The pattern there is more complex than those of southern and eastern Sweden; /eː/, /øː/ and /ʉː/ tend to rise while /ɛː/ and /oː/ fall; /uː/, /iː/, /yː/ and /ɑː/ are not diphthongized at all.
Initial fortis stops (/p, t, k/) are aspirated in stressed position, but unaspirated when preceded by /s/ within the same morpheme. Hence ko ('cow') is [kʰuː], but sko ('shoe') is [skuː]. Compare English [kʰuːɫ] ('cool') vs. [skuːɫ] ('school'). Preaspiration of medial and final fortis stops, including the devoicing of preceding sonorants, is common, though its length and normativity vary from dialect to dialect: it is optional (and idiolectal) in Central Standard Swedish but obligatory in, for example, the Swedish dialects of Gräsö, Vemdalen, and Arjeplog. In Gräsö, preaspiration is blocked in certain environments (such as an /s/ following the fortis consonant or a morpheme boundary between the vowel and the consonant), while it is a general feature of fortis medial consonants in Central Standard Swedish. When not preaspirated, medial and final fortis stops are simply unaspirated. In clusters of fortis stops, the second, "presonorant" stop is unaspirated and the first patterns with other medial and final stops (that is, it is either unaspirated or preaspirated).
The phonetic attributes of preaspiration also vary. In the Swedish of Stockholm, preaspiration is often realized as a fricative subject to the character of surrounding vowels or consonants so that it may be labial, velar, or dental; it may also surface as extra length of the preceding vowel. In the province of Härjedalen, though, it resembles [h] or [x]. The duration of preaspiration is greatest in the dialects of Vemdalen and Arjeplog. Helgason notes that preaspiration is longer after short vowels, in lexically stressed syllables, as well as in pre-pausal position.
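The aspiration rule for initial fortis stops (ko vs. sko) can be sketched in Python as follows. The segment-list input, the `stressed` flag and the function name `aspirate_initial` are illustrative assumptions; the sketch covers only the morpheme-initial position and does not model preaspiration.

```python
FORTIS_STOPS = {"p", "t", "k"}


def aspirate_initial(segments, stressed=True):
    """Aspirate a morpheme-initial fortis stop in a stressed syllable.

    A stop preceded by /s/ is not morpheme-initial, so it is left
    unaspirated without needing a separate rule for the /s/ case.
    """
    out = list(segments)
    if stressed and out and out[0] in FORTIS_STOPS:
        out[0] += "ʰ"
    return out


print(aspirate_initial(["k", "uː"]))       # ['kʰ', 'uː']       ko 'cow'
print(aspirate_initial(["s", "k", "uː"]))  # ['s', 'k', 'uː']   sko 'shoe'
```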
The Swedish fricatives /ɕ/ and /ɧ/ are often considered to be the most difficult aspects of Swedish pronunciation for foreign students. The combination of occasionally similar and rather unusual sounds as well as the large variety of partly overlapping allophones of /ɧ/ often presents difficulties for non-natives in telling the two apart. The existence of a third sibilant in the form of /s/ tends to confuse matters even more, and in some cases realizations that are labiodental can also be confused with /f/. In Finland Swedish, /ɕ/ is an affricate: [t͡ɕ] or [t͡ʃ].
The nature of the Swedish phoneme /ɧ/ (the "sje-sound" or voiceless postalveolar-velar fricative), including its alleged coarticulation, is a difficult and complex issue debated among phoneticians. Though the acoustic properties of its [ɧ] allophones are fairly similar, the realizations can vary considerably according to geography, social status, age, gender and social context, and are notoriously difficult to describe and transcribe accurately. Most common are various [ɧ]-like sounds, with [ʂ] occurring mainly in northern Sweden and [ɕ] in Finland. A voiceless uvular fricative, [χ], can sometimes be used in the varieties influenced by major immigrant languages like Arabic and Kurdish. The different realizations can be divided roughly into the following categories:
"Dark sounds" - [ɧ] and [x], commonly used in Southern Standard Swedish. Some of the varieties specific, but not exclusive, to areas with a larger immigrant population commonly realize the phoneme as a voiceless uvular fricative, [χ].
"Light sounds" - [ʂ], used in the northern varieties, and [ʃ] and [ɕ] (or something in between), used in Finland Swedish.
Combination of "light" and "dark" - darker sounds are used morpheme-initially before stressed vowels (sjuk, station; 'sick', 'station'), while the lighter sounds are used before unstressed vowels and at the end of morphemes (bagage, dusch; 'baggage', 'shower').
/v/ and /j/ are pronounced with weak friction and function phonotactically with the sonorants.
/r/ has distinct variations in Standard Swedish. The realization as an alveolar trill occurs among most speakers only in contexts where emphatic stress is used. In Central Swedish, it is often pronounced as a fricative (transcribed as [ʐ]) or approximant (transcribed as [ɹ]), which is especially frequent in weakly articulated positions such as word-finally and somewhat less frequent in stressed syllable onsets, in particular after other consonants. It may also be an apico-alveolar tap. One of the most distinct features of the southern varieties, which they share with Danish, is the use of uvular trills or voiced fricatives, [ʀ], [ʁ], for the /r/-phoneme.
In most varieties of Swedish that use an alveolar /r/ (in particular, the central and northern forms), the combination of /r/ with dental consonants (/t, d, n, l, s/) produces retroflex consonant realizations, a recursive sandhi process called "retroflexion". Thus, /kɑːrta/ ('map') is realized as [kʰɑːʈa], /nuːrd/ ('north') as [nuːɖ], /vɛːnern/ ('Vänern') as [vɛːnəɳ], and /fɛrsk/ ('fresh') as [fæʂːk]. The combination of /r/ and /l/ does not uniformly cause retroflexion, and sorl ('murmur') may be pronounced [soːɭ], [soːrl], or [soːl].
As the table to the right shows, this process is not limited by word boundaries, though there is still some sensitivity to the type of boundary between the /r/ and the dental, in that retroflexion is less likely across boundaries higher up in the prosodic hierarchy. In the southern varieties, which use a uvular /r/, retroflex realizations do not occur: for example, /kɑːrta/ ('map') is realized as [kʰɑʁta]. A double sequence /rr/ usually does not trigger retroflexion, so that spärrnät ('anti-submarine net') is pronounced [ˈspærːˌnɛːt]. The process of retroflexion is not limited to just one dental; först ('first'), for example, is pronounced [fœ̞ʂʈ].
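The recursive character of retroflexion can be made concrete with a short Python sketch, assuming a variety with alveolar /r/. The segment-list representation and the function name `retroflex` are illustrative assumptions. The sketch treats /r/ + /l/ as always retroflexing, although the text notes that this is variable, and it does not model consonant length, prosodic boundaries, or the vowel changes seen in the attested forms.

```python
# Dental -> retroflex correspondences listed in the text.
RETROFLEX = {"t": "ʈ", "d": "ɖ", "n": "ɳ", "l": "ɭ", "s": "ʂ"}
RETROFLEX_SET = set(RETROFLEX.values())


def retroflex(segments):
    """Apply retroflexion left to right over a list of segments.

    A dental becomes retroflex after /r/ (which is absorbed) or after a
    consonant that has itself just become retroflex, so the process
    spreads through a run of dentals. A long "rː" is a different symbol
    from "r" here and therefore does not trigger the rule.
    """
    out = []
    for seg in segments:
        after_r = bool(out) and out[-1] == "r"
        after_retroflex = bool(out) and out[-1] in RETROFLEX_SET
        if seg in RETROFLEX and (after_r or after_retroflex):
            if after_r:
                out.pop()  # /r/ merges into the following retroflex
            out.append(RETROFLEX[seg])
        else:
            out.append(seg)
    return out


print(retroflex(["k", "ɑː", "r", "t", "a"]))  # ['k', 'ɑː', 'ʈ', 'a']
print(retroflex(["f", "œ", "r", "s", "t"]))   # ['f', 'œ', 'ʂ', 'ʈ']
```

Because the rule is applied while scanning left to right, the /t/ of först is retroflexed by the preceding [ʂ] rather than by /r/ directly, which is what "recursive" amounts to in this sketch.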
Variations of /l/ are not as common, though some phonetic variation exists, such as a retroflex flap [ɽ] that exists as an allophone in proximity to a labial or velar consonant (e.g. glad, 'glad') or after most long vowels.
In casual speech, the nasals tend to assimilate to the place of articulation of a following obstruent so that, for example, han kom ('he came') is pronounced [haŋ ˈkʰɔmː].
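A minimal Python sketch of this place assimilation is given below. The function name `assimilate_nasals` and the flat segment list (with no word boundaries) are assumptions for illustration, and the trigger set is deliberately restricted to the plain stops; fricatives and aspirated symbols such as "kʰ" are left out to keep the place mapping uncontroversial.

```python
# Nasal that matches the place of articulation of each following stop.
# Only stops are included here; this is a simplification of "obstruent".
NASAL_FOR_STOP = {
    "p": "m", "b": "m",
    "t": "n", "d": "n",
    "k": "ŋ", "ɡ": "ŋ", "g": "ŋ",  # both IPA ɡ (U+0261) and ASCII g
}
NASALS = {"m", "n", "ŋ"}


def assimilate_nasals(segments):
    """Make each nasal homorganic with an immediately following stop."""
    out = list(segments)
    for i in range(len(out) - 1):
        if out[i] in NASALS and out[i + 1] in NASAL_FOR_STOP:
            out[i] = NASAL_FOR_STOP[out[i + 1]]
    return out


# han kom, written as one flat list of segments
print(assimilate_nasals(["h", "a", "n", "k", "ɔ", "mː"]))
# ['h', 'a', 'ŋ', 'k', 'ɔ', 'mː']
```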
As in English, there are many Swedish word pairs that are differentiated by stress:
formel [ˈfɔrːmɛl] — 'formula'
formell [fɔrˈmɛlː] — 'formal'
Stressed syllables differentiate two tones, often described as pitch accents, or tonal word accents by Scandinavian linguists. They are called acute and grave accent, tone/accent 1 and tone/accent 2, or Single Tone and Double Tone. The actual realizations of these two tones vary from dialect to dialect. In Standard Central Swedish, for example, the acute accent has a low tone while the grave accent has a high one. Generally, the grave accent is characterized by a later timing of the intonational pitch rise as compared with the acute accent; the so-called two-peaked dialects (such as Central and Western Swedish) also have another, earlier pitch peak in the grave accent, hence the term "two-peaked".
The phonemicity of this tonal system is demonstrated in the nearly 300 pairs of two-syllable words differentiated only by their use of either grave or acute accent. Outside of these pairs, the main tendency for tone is that the acute accent appears in monosyllables (since the grave accent cannot appear in monosyllabic words) while the grave accent appears in polysyllabic words. Polysyllabic forms resulting from declension or derivation also tend to have a grave accent except when it is the definite article that is added. This tonal distinction has been present in Scandinavian dialects at least since Old Norse though a greater number of polysyllables now have an acute accent. These are mostly words that were monosyllabic in Old Norse, but have subsequently become disyllabic, as have many loanwords. For example, Old Norse kømr ('comes') has become kommer in Swedish (with an acute accent).
Acute accent: anden [ˈa᷇ndɛ̀n] or [ˈan˥˧dɛn˩] — 'the duck' (from and 'duck')
In Central Swedish, this is a high, slightly falling tone followed by a low tone; that is, a single drop from high to low pitch spread over two syllables.
Grave accent: anden [ˈa᷆ndɛ̂n] or [ˈan˧˩dɛn˥˩] — 'the spirit' (from ande 'spirit')
In Central Swedish, a mid falling tone followed by a high falling tone; that is, a double falling tone.
The exact realization of the tones also depends on the syllable's position in an utterance. For instance, at the beginning of an utterance, the acute accent may have a rising rather than slightly falling pitch on the first syllable. Also, these are word tones that are spread across the syllables of the word. In grave-accent trisyllabic words, the second fall in pitch is distributed across the second and third syllables, with the result that the pitches are mid–low falling, high–mid falling, and low:
Grave-accent trisyllable: flickorna [ˈflɪ᷆kːʊ᷇ɳà] or [ˈflɪ˧˩kːʊ˥˧ɳa˩] — 'the girls'
The position of the tone is dependent upon stress: The first stressed syllable has a high or falling tone, as does the following syllable(s) in grave-accented words.
Prosody in Swedish often varies substantially between different dialects including the spoken varieties of Standard Swedish. As in most languages, stress can be applied to emphasize certain words in a sentence. To some degree prosody may indicate questions, although less so than in English.
In most Finland-Swedish varieties, however, the distinction between grave and acute accent is missing.
At a minimum, a word must consist of either a long vowel or a short vowel and a long consonant. Like many other Germanic languages, Swedish has a tendency for closed syllables with a relatively large number of consonant clusters in initial as well as final position. Though not as complex as that of most Slavic languages, examples of up to 7 consecutive consonants can occur when adding Swedish inflections to some foreign loanwords or names, and especially when combined with the tendency of Swedish to make long compound nouns. The syllable structure of Swedish can therefore be described with the following formula:
(C)(C)(C)V(C)(C)(C)
This means that a Swedish one-syllable morpheme can have up to three consonants preceding the vowel that forms the nucleus of the syllable, and three consonants following it. Examples: skrämts [skrɛmːts] (verb 'scare' past participle, passive voice) or sprängts [sprɛŋːts] (verb 'explode' past participle, passive voice). All but one of the consonant phonemes, /ŋ/, can occur at the beginning of a morpheme, though there are only 6 possible three-consonant combinations, all of which begin with /s/, and a total of 31 initial two-consonant combinations. All consonants except for /h/ and /ɕ/ can occur finally, and the total number of possible final two-consonant clusters is 62.
In some cases this can result in near-unpronounceable combinations, such as in västkustskt, consisting of västkust ('west coast') with the adjective suffix -sk and the neuter suffix -t.
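The one-syllable template described above (up to three consonants before the nucleus and up to three after it) can be checked mechanically. The following Python sketch is an illustration under stated assumptions: the function name `fits_template` is hypothetical, the caller supplies the set of vowel symbols, and the check looks only at the number of consonants, not at which specific clusters are permitted.

```python
import re

# Skeleton pattern for the template: at most three C, one V, at most three C.
TEMPLATE = re.compile(r"C{0,3}VC{0,3}")
LENGTH_MARK = "ː"


def fits_template(segments, vowels):
    """Return True if a segmented one-syllable morpheme fits the template.

    `vowels` is a caller-supplied set of vowel symbols without length
    marks; every other segment is counted as a consonant.
    """
    skeleton = "".join(
        "V" if seg.rstrip(LENGTH_MARK) in vowels else "C" for seg in segments
    )
    return TEMPLATE.fullmatch(skeleton) is not None


vowels = {"ɛ"}  # only the vowel needed for the two examples from the text
print(fits_template(["s", "k", "r", "ɛ", "mː", "t", "s"], vowels))  # True
print(fits_template(["s", "p", "r", "ɛ", "ŋː", "t", "s"], vowels))  # True
```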
Central Standard Swedish and most other Swedish dialects feature a rare "complementary quantity" feature wherein a phonologically short consonant follows a long vowel and a long consonant follows a short vowel; this is true only for stressed syllables, and all segments are short in unstressed syllables. This arose from the historical shift away from a system with a four-way contrast (that is, VːCː, VC, VːC, and VCː were all possible) inherited from Proto-Germanic to a three-way one (VC, VːC, and VCː), and finally the present two-way one; certain Swedish dialects have not undergone these shifts and exhibit one of the other two phonotactic systems instead. In literature on Swedish phonology, there are a number of ways to transcribe this complementary relationship, including:
A length mark ː for either the vowel (/viːt/), the consonant (/vitː/), or both.
Gemination of the consonant (/vit/ vs. /vitt/)
Diphthongization of the vowel (/vijt/ vs. /vit/)
The position of the stress marker (/viˈt/ vs. /vitˈ/)
With the conventional assumption that medial long consonants are ambisyllabic (that is, penna, 'pen', is syllabified as [ˈpɛn.na]), all stressed syllables are thus "heavy". In unstressed syllables, the distinction is lost between /u/ and /o/ or between /e/ and /ɛ/. With each successive post-stress syllable, the number of contrasting vowels decreases gradually with distance from the point of stress; at three syllables from stress, only [a] and [ə] occur.
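Complementary quantity in stressed syllables amounts to an "exactly one of the two is long" condition, which the following Python sketch makes explicit. It assumes the length-mark transcription convention (ː on either the vowel or the consonant); the function name `has_complementary_quantity` and its two-argument interface are illustrative assumptions.

```python
LENGTH_MARK = "ː"


def has_complementary_quantity(vowel, consonant=None):
    """Check a stressed vowel and the consonant that follows it.

    Well-formed in the present two-way system: a long vowel (followed by
    a short consonant or by nothing) or a short vowel followed by a long
    consonant. Pass consonant=None for an open syllable.
    """
    vowel_long = vowel.endswith(LENGTH_MARK)
    consonant_long = consonant is not None and consonant.endswith(LENGTH_MARK)
    return vowel_long != consonant_long  # exactly one of the two is long


print(has_complementary_quantity("iː", "t"))  # True   /viːt/
print(has_complementary_quantity("i", "tː"))  # True   /vitː/
print(has_complementary_quantity("i", "t"))   # False  VC, the old short type
```

The same check rejects a short vowel in an open stressed syllable, and it also rejects the VːCː type from the earlier four-way system.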
The sample text is a reading of The North Wind and the Sun. The transcriptions are based on the section on Swedish found in the Handbook of the International Phonetic Association. The broad transcription is phonemic while the narrow is phonetic.
Nordanvinden och solen tvistade en gång om vem av dem som var starkast. Just då kom en vandrare vägen fram insvept i en varm kappa. De kom då överens om att den som först kunde få vandraren att ta av sig kappan, han skulle anses vara starkare än den andra. Då blåste nordanvinden så hårt han nånsin kunde, men ju hårdare han blåste desto tätare svepte vandraren kappan om sig, och till slut gav nordanvinden upp försöket. Då lät solen sina strålar skina helt varmt och genast tog vandraren av sig kappan och så var nordanvinden tvungen att erkänna att solen var den starkaste av de två.
Engstrand, Olle (1999), "Swedish", Handbook of the International Phonetic Association: A Guide to the Use of the International Phonetic Alphabet, Cambridge: Cambridge University Press, pp. 140–142, ISBN 0-521-63751-1
Engstrand, Olle (2004), Fonetikens grunder (in Swedish), Lund: Studentlitteratur, ISBN 91-44-04238-8
Fant, G. (1983), "Feature analysis of Swedish vowels - a revisit", Speech, Music and Hearing Quarterly Progress and Status Report 24 (2–3): 1–19
Garlén, Claes (1988), Svenskans fonologi (in Swedish), Lund: Studentlitteratur, ISBN 91-44-28151-X
Gårding, E. (1974), Kontrastiv prosodi, Lund: Gleerup
Hamann, Silke (2003), The Phonetics and Phonology of Retroflexes, Utrecht, ISBN 90-76864-39-X
Helgason, Pétur (1998), "On-line preaspiration in Swedish: implications for historical sound change", Proceedings of Sound Patterns of Spontaneous Speech 98, pp. 51–54
Helgason, Pétur (1999a), "Preaspiration and sonorant devoicing in the Gräsö dialect: preliminary findings", Proceedings of The Swedish Phonetics Conference 1999, Gothenburg Papers in Theoretical Linguistics, Göteborg University, pp. 77–80
Helgason, Pétur (1999b), "Phonetic preconditions for the development of normative preaspiration", Proceedings of the 14th International Congress of Phonetic Sciences, San Francisco, pp. 1851–1854
Liberman, Anatoly (1978), "Pseudo-støds in Scandinavian languages", Orbis 27: 52–76
Liberman, Anatoly (1982), Germanic Accentology, 1: The Scandinavian Languages, Minneapolis: University of Minnesota Press
Petrova, Olga; Plapp, Rosemary; Ringen, Catherine; Szentgyörgyi, Szilárd (2006), "Voice and aspiration: Evidence from Russian, Hungarian, German, Swedish, and Turkish", The Linguistic Review 23: 1–35, doi:10.1515/tlr.2006.001
Riad, T. (1992), Structures in Germanic Prosody, Department of Scandinavian Languages, Stockholm University
Ringen, Catherine; Helgason, Pétur (2004), "Distinctive [voice] does not imply regressive assimilation: evidence from Swedish", International Journal of English Studies: Advances in Optimality Theory 4 (2): 53–71
Schaeffler, Felix (2005), "Phonological Quantity in Swedish Dialects", Phonum 10
Tronnier, Mechtild (2002), "Preaspiration in Southern Swedish dialects", Proceedings of Fonetik 44 (1): 33–36
Wretling, P.; Strangert, E.; Schaeffler, F. (2002), "Quantity and Preaspiration in Northern Swedish Dialects", in Bel, B.; Marlien, I., Proceedings of the Speech Prosody 2002 conference, Aix-en-Provence: Laboratoire Parole et Langage, pp. 703–706