Aspirated consonants after S

I understand that aspirated consonants (or unvoiced stops) after ‘S’ are pronounced as unaspirated. For instance, STAIN is pronounced as SDAIN, EXTEND is pronounced as EXDEND.

However, for the word SIXTEEN, there seem to be inconsistencies. According to , the American version of SIXTEEN is pronounced with an aspirated T. Why is that? Are there any other similar words?

Is there a list of such rules and exceptions available out there?


All stops are ordinarily pronounced without aspiration following /s/. But if you do aspirate such a stop nobody will hear it.

In the first place, I think you are confusing aspiration and voicing.

  • Voicing is vocal cord vibration, the ‘hum’. In IPA notation, voiced and voiceless consonants are distinguished with different glyphs. The ‘official’ distinction between /t/ and /d/ is one of voicing, not aspiration: /t/ is voiceless, /d/ is voiced.
  • Aspiration is a brief puff of air after a stop is released, before the onset of voicing on the following vowel. In IPA notation, aspiration is notated with a superscript ʰ. Aspirated [b], for instance, is notated [bʰ]. The specific absence of aspiration, as snailboat tells us, may be notated with =: [t=] is unaspirated [t]. At the site you link to it is the UK pronunciation which bears a detectable degree of aspiration; the US pronunciation has little or none.

In the second place, I think you are confusing phonetic phenomena with phonemic phenomena.

  • Phonetic sounds are phones, the physical sounds which people actually produce. They cover an infinite range; every individual has their own characteristic pattern of sounds, and every one of their utterances is slightly different. Phonetic transcriptions are conventionally presented in square brackets, thus [t], [tʰ], [d], [dʰ].
  • Phonemic ‘sounds’ are phonemes, linguistic abstractions; these represent the sounds which people intend and hear and recognize. By way of analogy: the letter A takes many different forms, but they are all recognized as A.
    Phonemic transcriptions are conventionally presented in slashes, thus /t/. This is the sort of transcription that shows up in dictionaries: a transcription which tells you not the actual sounds, but the meaningful sounds.

In English, aspiration is not phonemic. That is, aspiration does not consistently distinguish one English phoneme from another. Whether or not a consonant is aspirated is determined almost entirely by the sounds which surround it, its context, and that context is predictable. Consequently, aspiration does not add any meaning, which is why it is not marked in dictionary pronunciations.

However (and this is where it gets tricky), aspiration is called into play to assist in phonemic distinction in some contexts—specifically, in distinguishing syllable-initial voiced and voiceless stops.

This is because, when you come right down to it, there is no such thing as a ‘voiced’ stop. A stop, by definition, completely interrupts the flow of air, and without air passing over the vocal cords there can be no voicing. With /t/ and /d/, for instance, which are articulated identically, there is no actual difference on the consonant itself. What we call a ‘voiced’ stop is in fact usually recognized mostly by ‘extra’ voicing before the air flow is stopped: typically, the preceding vowel is pronounced longer or even turned into a diphthong.

But this doesn’t work when there is no preceding vowel—at the beginning of a syllable. Since we cannot mark the voiced stop here, we mark the voiceless stop instead, with aspiration. Aspiration is not voiced (there is no hum on the air which is released), so the voicing on the vowel is delayed. The discernible difference between the aspiration which belongs to the consonant and the voice which belongs to the vowel marks the consonant as voiceless.

(But you should not take this to mean that native speakers ‘hear’ the aspiration. They don’t. What they ‘hear’ is /t/ rather than /d/.)

Why then are voiceless stops not aspirated after /s/ at the beginning of a syllable? Because they don’t have to be. The clusters /sb/, /sd/, /sg/ do not exist in English. If you hear /s/-plus-dental-stop it cannot be /sd/; it cannot be anything except /st/, so there is no need for aspiration to resolve an ambiguity.
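To make the pattern concrete, here is a rough sketch in Python of the rule as I have described it. It is only a toy: the function name, the list-of-symbols representation of a syllable, and the restriction to /p/, /t/, /k/ are my own assumptions for illustration, and real speech is far messier (stress, for one thing, is ignored here).

```python
# Toy model of the aspiration pattern described above (an illustration,
# not a complete account of English phonetics).
VOICELESS_STOPS = {"p", "t", "k"}


def narrow_transcription(syllable):
    """Turn one syllable (a list of phoneme symbols) into a list of phones.

    Assumed rules:
      - a voiceless stop that begins the syllable is aspirated (ʰ);
      - a voiceless stop directly after a syllable-initial /s/ is
        marked unaspirated (=);
      - every other segment is passed through unchanged.
    """
    phones = []
    for i, seg in enumerate(syllable):
        if seg in VOICELESS_STOPS and i == 0:
            phones.append(seg + "ʰ")
        elif seg in VOICELESS_STOPS and i == 1 and syllable[0] == "s":
            phones.append(seg + "=")
        else:
            phones.append(seg)
    return phones


print(narrow_transcription(["t", "eɪ", "n"]))       # a made-up syllable /teɪn/
print(narrow_transcription(["s", "t", "eɪ", "n"]))  # "stain"
print(narrow_transcription(["s", "t", "iː", "n"]))  # second syllable of "sixteen"
```

The input to the function is phonemic and the output is phonetic: the same /t/ comes out as [tʰ] or [t=] depending purely on what precedes it in the syllable, which is the sense in which aspiration is predictable from context.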

As for your specific question: What people hear is phonemes, not phones. You may pronounce it as [sɪk·st=iːn] or [sɪk·stʰiːn] or [sɪk·sd=iːn] or even as [sɪg·zdʰiːn] and nobody will notice. What they will hear is /sɪk·stiːn/, because that’s all they know how to hear in that context, and that’s all that makes sense.

There are of course cases like iceberg and misdeed; but there we have adjacent sounds at the boundary between two distinct morphemes. It is only spelling convention which makes misdeed a single word.

