Vietnamese phonology

This article is a technical description of the sound system of the Vietnamese language, including phonetics and phonology.

Consonants
Two main varieties of Vietnamese, Hanoi and Ho Chi Minh City, are described below.

Hanoi
The 21 consonants of the Hanoian variety:


 * Thompson posits a glottal stop phoneme in a more abstract analysis of Hanoi Vietnamese that would eliminate the phonemes by involving sequences of glottal stop + consonant . Specifically, he proposes:
 * This analysis also simplifies the syllable description so that all syllables have obligatory onsets.


 * is labial-velar and always preceded by a consonant or glottal stop (though  does not occur before  in the southern varieties)
 * occurs syllable-initially only in borrowed vocabulary, and even then in a few words it is converted into (as in sâm banh, derived from French champagne). Many Vietnamese speakers, especially those from the central and southern parts of the country, have difficulty articulating it and usually replace it with .  In native Vietnamese words, occurs only at the end of a syllable.
 * The glottalized stops are preglottalized and voiced:  (i.e., the glottis is always closed before the oral closure). This glottal closure is often not released before the release of the oral closure, resulting in the characteristic implosive pronunciation. However, sometimes the glottal closure is released prior to the oral release in which case the stops are pronounced . Therefore, the primary characteristic is preglottalization with implosion being secondary.
 * Among the coronals:
 * are dental:.
 * are alveolar:.
 * are apical (i.e. with the tongue tip).
 * are laminal (i.e. with the tongue blade).
 * Saigonese is not present
 * are phonetically palatoalveolar (i.e. the blade of the tongue makes contact behind the alveolar ridge).
 * is often slightly affricated, but it is unaspirated. (Note that the English affricate is aspirated and usually apical, unlike Vietnamese). This affrication, however, is not obligatory.
 * exists only in loan words.

Analysis of final ch, nh
The pronunciation of syllable-final ch and nh in Hanoi Vietnamese has been analyzed in different ways. One analysis treats them as phonemes, where  contrasts with both syllable-final t  and c  and  contrasts with syllable-final n  and ng. Final is, then, identified with syllable-initial.

Another analysis has final ch and nh as representing predictable allophonic variants of the velar phonemes and  that occur after upper front vowels  (orthographic i) and  (orthographic ê).

Arguments for the second analysis include the limited distribution of final and, the gap in the distribution of  and , which do not occur after  and , and the patterning of ~ and ~ in certain reduplicated words. Additionally, final is not usually articulated as far forward as the initial :  and  are pre-velar. The preceding upper front vowels are co-articulated as well, resulting in centralized or relaxed variants:


 * {| cellspacing="5"


 * || → || ich || or
 * || → || inh || or
 * || → || êch ||
 * || → || ênh ||
 * }

Finally, this analysis interprets orthographic ach and anh as having a vowel nucleus with a front component. One interpretation considers the orthographic a in these sequences as underlyingly a diphthong with a high front off-glide (thus equating it with orthographic ay) — in other words,  becomes  and  becomes. Another interpretation of the orthographic a is that it is underlyingly the vowel, which becomes phonetically open and diphthongized: → ,  →.

The first analysis closely follows the surface pronunciation of a slightly different Hanoi dialect than the second. In this dialect, the in  and  is not diphthongized but is actually articulated more forward, approaching a front vowel. This results in a three-way contrast between the rimes ăn vs. anh  vs. ăng. For this reason, a separate phonemic is posited.

Similarly, êch and ênh can be analyzed by treating the orthographic ê in these sequences as underlyingly a diphthong with a mid central off-glide (thus equating it with orthographic ây); in other words,  becomes  and  becomes.

Phonological processes

 * A glottal stop is inserted before words that begin with a vowel or the glide :


 * {| cellpadding="5" style="line-height: 1.0em;"


 * ăn
 * 'to eat'
 * uỷ
 * 'to delegate'
 * }


 * When stops occur at the end of words, they have no audible release due to accompanying glottal closure: :


 * {| cellpadding="5" style="line-height: 1.0em;"


 * đáp
 * 'to reply'
 * mát
 * 'cool'
 * khác
 * 'different'
 * }


 * When the velar consonants follow, they are articulated with a simultaneous bilabial closure  (i.e. doubly articulated) or are strongly labialized.


 * {| cellpadding="5" style="line-height: 1.0em;"


 * đục
 * 'muddy'
 * độc
 * 'poison'
 * ung
 * 'cancer'
 * ong
 * 'bee'
 * }
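
The first two processes above (glottal stop insertion and unreleased final stops) can be sketched in code. The following Python sketch is only an illustration under simplifying assumptions: it works on spelled words rather than phonemic transcriptions, it treats any word whose first letter is a vowel letter (including the u and o that spell the glide) as taking an initial glottal stop, and it assumes that orthographic final p, t, c and ch are the stop codas. The function name `annotate` and the output notation are hypothetical choices made for this example.

```python
import unicodedata

VOWEL_LETTERS = set("aeiouy")          # base letters once diacritics are split off
FINAL_STOPS = ("ch", "p", "t", "c")    # orthographic stop codas (assumed list)
UNRELEASED = "\u031a"                  # IPA combining mark for "no audible release"


def annotate(word):
    """Mark glottal-stop insertion and unreleased final stops on a spelled word."""
    lowered = word.lower()
    # NFD splits letters such as "ă" into a base letter plus combining marks,
    # so the first code point is the plain base letter.
    base = unicodedata.normalize("NFD", lowered)
    out = word
    # Process 1: a glottal stop precedes vowel-initial (or glide-initial) words.
    if base and base[0] in VOWEL_LETTERS:
        out = "ʔ" + out
    # Process 2: word-final stops have no audible release.
    if lowered.endswith(FINAL_STOPS):
        out = out + UNRELEASED
    return out
```

Applied to the examples above, `annotate("ăn")` adds only the initial glottal stop, while `annotate("đáp")` adds only the no-release mark after the final p. The third process is left out because it depends on the quality of the preceding vowel.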

Ho Chi Minh City (Saigon)
The 22 consonants of the Ho Chi Minh City (HCMC) dialect (Saigon dialect):

Phonetics
The Saigonese Vietnamese variety is essentially the same as the Hanoi with the following exceptions:


 * is generally pronounced in informal speech, but the speakers generally pronounce  when they read a text. It is always pronounced  in loan words (va li, ti vi etc.), even in informal speech. There is  that is also present among other speakers. These pronunciations are remnants of a merger and sound change involving  in southern speech (but  is always present in the northern and central regions).
 * Some speakers don't distinguish and.
 * Some speakers don't distinguish and.
 * Some speakers pronounce d as, and gi as , many speakers pronounce both as.
 * Hanoian is pronounced  in Saigonese.
 * Saigonese is generally slightly more palatalized than the Hanoian variety:.
 * In southern speech, the phoneme has a number of variant pronunciations that depend on the speaker. More than one pronunciation may even be found within a single speaker. It may occur as a retroflex fricative, an alveolar approximant , a flap , a trill , or a fricative flap/trill . This sound is generally represented in Vietnamese linguistics by the letter ⟨r⟩.
 * Among the coronals:
 * can be dental:.
 * can be alveolar:.
 * can be apical:.
 * can be laminal:.
 * Unlike in Hanoian, the glide in Saigonese when at the beginning of a syllable is not preceded by a glottal stop, oanh is pronounced.

Regional consonant variation
In Hanoian Vietnamese, d, gi and r are all pronounced, while x and s are both pronounced. The table below summarizes these sound correspondences:


 * {| class="wikitable" style="text-align: center;"

! colspan="5" | Syllable onsets ! rowspan="2" | Hanoi ! rowspan="2" | Saigon ! colspan="3" | Example ! word ! Hanoi ! Saigon
 * rowspan=2 |
 * style="text-align: left;" | vợ  "wife"
 * rowspan=2 |
 * style="text-align: left;" | da  "skin"
 * style="text-align: left;" | ra  "to go out"
 * rowspan="2" |
 * style="text-align: left;" | chẻ  "split"
 * style="text-align: left;" | trẻ  "young"
 * rowspan="2" |
 * style="text-align: left;" | xinh  "beautiful"
 * style="text-align: left;" | sinh  "born"
 * }
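
The Hanoi onset mergers in this table can be expressed as a small lookup. The sketch below is an illustration only: it works on spellings, groups the onsets d, gi and r, the onsets x and s, and the onsets ch and tr (the last pair is read off the paired table rows), and uses arbitrary class labels rather than phonetic symbols. The names `hanoi_key` and `homophones_in_hanoi` are hypothetical.

```python
import unicodedata

# Orthographic onsets that the table shows as merged in Hanoi.
# The class labels are arbitrary tags, not phonetic symbols.
HANOI_MERGED_ONSETS = {
    "gi": "D/GI/R", "d": "D/GI/R", "r": "D/GI/R",
    "x": "X/S", "s": "X/S",
    "ch": "CH/TR", "tr": "CH/TR",
}


def hanoi_key(syllable):
    """Return (onset class, rest of syllable); the class is None if unmerged."""
    s = unicodedata.normalize("NFC", syllable.lower())
    # Try two-letter spellings first so "gi", "ch" and "tr" are matched whole.
    for onset in sorted(HANOI_MERGED_ONSETS, key=len, reverse=True):
        if s.startswith(onset):
            return (HANOI_MERGED_ONSETS[onset], s[len(onset):])
    return (None, s)


def homophones_in_hanoi(a, b):
    """True if two spelled syllables differ at most in a merged onset."""
    return hanoi_key(a) == hanoi_key(b)
```

With the example words from the table, `homophones_in_hanoi("da", "ra")`, `homophones_in_hanoi("xinh", "sinh")` and `homophones_in_hanoi("chẻ", "trẻ")` all hold. Spellings in which the i of gi also serves as the vowel (such as gì) are not handled by this simplification.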

There are also sound mergers involving syllable-final consonants among the different regional varieties. These correspondences differ from the initial consonant correspondences discussed above. In Saigonese Vietnamese, and  are not distinguished; they are both pronounced, except when the coronals occur after the higher front vocalics , in which case Saigon  remain the same as in Hanoian. Additionally, -ch and -nh are pronounced [t, n] in Saigon:


 * {| class="wikitable" style="text-align: center;"

! colspan="5" | Syllable codas ! rowspan="2" | Hanoi ! rowspan="2" | Saigon ! colspan="3" | Example ! word ! Hanoi ! Saigon
 * style="text-align: left;" |
 * rowspan="2" |
 * style="text-align: left;" | mắt  "eye"
 * style="text-align: left;" |
 * style="text-align: left;" | mắc  "expensive"
 * style="text-align: left;" |
 * rowspan="2" |
 * style="text-align: left;" | răn  "warn"
 * style="text-align: left;" |
 * style="text-align: left;" | răng  "tooth"
 * style="text-align: left;" | after
 * rowspan="2" |
 * style="text-align: left;" | chết  "to die"
 * style="text-align: left;" | after
 * style="text-align: left;" | chếch  "askance"
 * style="text-align: left;" | after
 * rowspan="2" |
 * style="text-align: left;" | xin  "to ask"
 * style="text-align: left;" | after
 * style="text-align: left;" | xinh  "beautiful"
 * }
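
The Saigon coda mergers in this table can be sketched as a function that maps a spelled syllable to a merger class. This is a rough illustration under stated assumptions: it works on spellings, it reads "higher front vocalics" as the vowels spelled i and ê, and its class labels (`CORONAL`, `MERGED`) are arbitrary tags that make no claim about the phonetic outcome. The names `saigon_key` and `strip_tone` are hypothetical.

```python
import unicodedata

# Combining marks for the five written tones (grave, acute, tilde, hook, dot below).
TONE_MARKS = {"\u0300", "\u0301", "\u0303", "\u0309", "\u0323"}
CODAS = ("ch", "nh", "ng", "t", "c", "n")   # longer spellings are tried first


def strip_tone(text):
    """Remove tone marks but keep vowel-quality marks such as the circumflex."""
    nfd = unicodedata.normalize("NFD", text)
    kept = "".join(ch for ch in nfd if ch not in TONE_MARKS)
    return unicodedata.normalize("NFC", kept)


def saigon_key(syllable):
    """Return a key that is equal for syllables whose codas merge in Saigon."""
    s = unicodedata.normalize("NFC", syllable.lower())
    coda = next((c for c in CODAS if s.endswith(c)), "")
    if not coda:
        return s                     # open syllable or a coda not covered here
    body = s[: -len(coda)]
    manner = "STOP" if coda in ("t", "c", "ch") else "NASAL"
    # Assumption: the high front vowels are those spelled i and ê ("\u00ea").
    after_high_front = strip_tone(body).endswith(("i", "\u00ea"))
    place = "CORONAL" if coda in ("ch", "nh") or after_high_front else "MERGED"
    return f"{body}+{place}-{manner}"
```

Under these assumptions the example pairs from the table share a key: mắt and mắc, răn and răng, chết and chếch, xin and xinh.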

As can be seen above, vowels also vary among different regions.

Monophthongs

 * {| class="wikitable" style="width: 280px;" align="right"

! ! Front ! Central ! Back ! Close ! Close-mid ! Open-mid ‹â› ! Open ‹ă›
 * -align=center style="height: 65px"
 * ‹i, y›
 * ‹ư›
 * ‹u›
 * -align=center style="height: 65px;"
 * ‹ê›
 * ‹ô›
 * -align=center style="height: 65px;"
 * -align=center style="height: 65px;"
 * ‹e›
 * ‹ơ›
 * ‹o›
 * -align=center style="height: 65px;"
 * ‹a›
 * }

The IPA vowel chart of monophthongs (i.e., simple vowels) to the right is based on the sounds of Hanoi Vietnamese; other regions of Vietnam may have different inventories.


 * All vowels are unrounded except for the three back rounded vowels:.
 * and are pronounced short — shorter than the other vowels.
 * and : The short ‹â› and long ‹ơ› may differ in both height and length, but the difference in length is most likely the primary distinction.
 * : Many descriptions, such as Thompson, consider this vowel to be close back unrounded: . However, Han's instrumental analysis indicates that it is more central than back. Other descriptions also transcribe this vowel as central.
 * The vowel becomes  before : lịch  →, chúc  → , thức  →  etc.
 * In Southern Vietnamese, the and  are centralized before  and : bên  →, xin  →  etc.
 * In Southern Vietnamese, the high and upper-mid vowels are diphthongized in open syllables: :


 * {| cellpadding="4" style="line-height: 1.0em;"


 * chị
 * 'elder sister'
 * quê
 * 'countryside'
 * tư
 * 'fourth'
 * mơ
 * 'to dream'
 * thu
 * 'autumn'
 * cô
 * 'paternal aunt'
 * }

Diphthongs and triphthongs
In addition to monophthongs, Vietnamese has many diphthongs and triphthongs. Most of these consist of a vowel followed by or. The chart below lists the diphthongs and triphthongs of general northern speech.


 * {| class="wikitable"

! align="center" | Diphthongs ! align="center" | Diphthongs/ Triphthongs ! align="center" | Diphthongs/ Triphthongs
 * align="center" |
 * align="center" |
 * align="center" |
 * align="center" |
 * align="center" |
 * align="center" |
 * align="center" |
 * align="center" |
 * align="center" |
 * align="center" |
 * align="center" |
 * align="center" |
 * align="center" |
 * align="center" |
 * align="center" |
 * align="center" |
 * align="center" |
 * align="center" |
 * align="center" |
 * align="center" |
 * align="center" |
 * align="center" |
 * align="center" |
 * }


 * never follows front vowels.
 * never follows rounded vowels.

Regional vowel variation

 * says that in Hanoi, words spelled with ưu and ươu are pronounced, respectively, whereas other dialects in the Tonkin delta pronounce them as and . Hanoi speakers that do pronounce these words with  and  are using only a spelling pronunciation.


 * also notes that in Hanoi the diphthongs, iê, ươ , uô , may be pronounced , respectively (as the spelling suggests), but before and in open syllables these are always pronounced.

Tone
Vietnamese vowels are all pronounced with an inherent tone. Tones differ in


 * pitch
 * length
 * contour melody
 * intensity
 * phonation (with or without accompanying constricted vocal cords)

Unlike many Native American, African, and Chinese languages, Vietnamese tones do not rely solely on pitch contour. Vietnamese often uses instead a register complex (which is a combination of phonation type, pitch, length, vowel quality, etc.). So perhaps a better description would be that Vietnamese is a register language and not a "pure" tonal language.

In Vietnamese orthography, tone is indicated by diacritics written above or below the vowel.
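
Because the tone is written as a diacritic, it can be read off a spelled syllable mechanically. The following Python sketch is an illustration, not a standard tool: it decomposes the syllable with Unicode NFD normalization and looks for one of the five combining tone marks, treating an unmarked syllable as ngang. The function name `tone_of` is hypothetical.

```python
import unicodedata

# Unicode combining marks used for the five written tones.
TONE_MARKS = {
    "\u0300": "huyền",  # combining grave accent
    "\u0301": "sắc",    # combining acute accent
    "\u0303": "ngã",    # combining tilde
    "\u0309": "hỏi",    # combining hook above
    "\u0323": "nặng",   # combining dot below
}


def tone_of(syllable):
    """Return the name of the tone written on a single spelled syllable."""
    # NFD separates tone marks from vowel-quality marks (breve, circumflex, horn),
    # so each combining mark can be inspected on its own.
    for ch in unicodedata.normalize("NFD", syllable):
        if ch in TONE_MARKS:
            return TONE_MARKS[ch]
    return "ngang"  # no tone mark is written for the level tone
```

For example, `tone_of("bà")` gives huyền and `tone_of("ba")` gives ngang. Vowel-quality diacritics, as in ă, â, ê, ô, ơ and ư, are ignored because they decompose to marks that are not in the table.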

Six-tone analysis
There is much variation among speakers concerning how tone is realized phonetically. There are differences between varieties of Vietnamese spoken in the major geographic areas (i.e. northern, central, southern) and smaller differences within the major areas (e.g. Hanoi vs. other northern varieties). In addition, there seems to be variation among individuals. More research is needed to determine the remaining details of tone realization and the variation among speakers.

Northern varieties
The six tones in the Hanoi and other northern varieties are:


{| class="wikitable" style="text-align:center;"
! Tone name !! Tone ID !! Description !! Chao tone contour !! Diacritic !! Example
|-
| ngang "level" || A1 || mid level || (33) || (no mark) || ba 'three'
|-
| huyền "hanging" || A2 || low falling (breathy) || (21) or (31) || ` || bà 'lady'
|-
| sắc "sharp" || B1 || mid rising, tense || (35) || ´ || bá 'governor'
|-
| nặng "heavy" || B2 || mid falling, glottalized, short || (3ˀ2ʔ) or (3ˀ1ʔ) ||  ̣ || bạ 'at random'
|-
| hỏi "asking" || C1 || mid falling(-rising), harsh || (313) or (323) or (31) ||  ̉ || bả 'poison'
|-
| ngã "tumbling" || C2 || mid rising, glottalized || (3ˀ5) or (4ˀ5) || ˜ || bã 'residue'
|}



Ngang tone:


 * The ngang tone is level at around the mid level (33) and is produced with modal voice phonation (i.e. with "normal" phonation). Alexandre de Rhodes (1651) describes this as "level"; describes it as "high (or mid) level".

Huyền tone:


 * The huyền tone starts low-mid and falls (21). Some Hanoi speakers start at a somewhat higher point (31). It is sometimes accompanied by breathy voice (or lax) phonation in some speakers, but this is lacking in other speakers: bà = or . Alexandre de Rhodes (1651) describes this as "grave-lowering";  describes it as "low falling".

Hỏi tone:


 * The hỏi tone starts at a mid level and falls. It starts with modal voice phonation, which moves increasingly toward tense voice with accompanying harsh voice (although the harsh voice seems to vary according to speaker). In Hanoi, the tone is mid falling (31). In other northern speakers, the tone is mid falling and then rises back to the mid level (313 or 323). This characteristic gives this tone its traditional description as "dipping". However, the falling-rising contour is most obvious in citation forms or when syllable-final; in other positions and in fast speech, the rising contour is negligible. The hỏi tone is also relatively short compared with the other tones, but not as short as the nặng tone. Alexandre de Rhodes (1651) describes this as "smooth-rising"; describes it as "dipping-rising".

Ngã tone:


 * The ngã tone is mid rising (35). Many speakers begin the vowel with modal voice, followed by strong creaky voice starting toward the middle of the vowel, which then lessens as the end of the syllable is approached. Some speakers with more dramatic glottalization have a glottal stop closure in the middle of the vowel (i.e. as ). In Hanoi Vietnamese, the tone starts at a higher pitch (45) than in other northern speakers. Alexandre de Rhodes (1651) describes this as "chesty-raised"; describes it as "creaking-rising".

Sắc tone:


 * The sắc tone starts as mid and then rises (35) in much the same way as the ngã tone. It is accompanied by tense voice phonation throughout the duration of the vowel. In some Hanoi speakers, the ngã tone is noticeably higher than the sắc tone, for example: sắc = (34); ngã =  (45). Alexandre de Rhodes (1651) describes this as "acute-angry";  describes it as "high (or mid) rising".

Nặng tone:


 * The nặng tone starts mid or low-mid and rapidly falls in pitch (32 or 21). It starts with tense voice that becomes increasingly tense until the vowel ends in a glottal stop closure. This tone is noticeably shorter than the other tones. Alexandre de Rhodes (1651) describes this as "chesty-heavy"; describes it as "constricted".

Southern varieties
The Southern variety is similar for all tones except the nặng tone, which is pronounced [˨˧]. Many speakers of Southern dialects do not use the ngã tone and replace it with the hỏi tone.

North-central and Central varieties
North-central and Central Vietnamese varieties are fairly similar with respect to tone although within the North-central dialect region there is considerable internal variation.

It is sometimes said (by people from other provinces) that people from Nghệ An pronounce every tone as a nặng tone.

Eight-tone analysis
An older analysis assumes eight tones rather than six. This follows the lead of traditional Chinese phonology. In Middle Chinese, normal syllables allowed for three tonal distinctions, but syllables ending with, or  had no tonal distinctions. Rather, they were consistently pronounced with a short high tone, which was called the entering tone and considered a fourth tone. Similar considerations lead to the identification of two additional tones in Vietnamese for syllables ending in, , or. These are not phonemically distinct, however, and hence are not considered separate tones by modern linguists.

Syllables and phonotactics
There are reportedly 4,500 to 4,800 possible spoken syllables (depending on dialect), and the standard national orthography (Quốc Ngữ) can represent 6,200 syllables, since Quốc Ngữ represents more phonemic distinctions than are made by any one dialect.

The Vietnamese syllable structure follows the scheme:


 * (C1)(w)V(G|C2)+T

where
 * {| cellpadding="7" cellspacing="0" style="background: #f9f9f9;"


 * valign="top" |
 * C1 = initial consonant onset
 * w = bilabial on-glide
 * V = vowel nucleus
 * valign="top" |
 * G = off-glide coda ( or )
 * C2 = final consonant coda
 * T = tone.
 * }

In other words, a syllable has an optional onset, consisting of a single consonant or a consonant and the glide, and an optional coda. The vowel nucleus may have an additional glide element.
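
The scheme (C1)(w)V(G|C2)+T can be expanded mechanically into the set of shapes it allows. The Python sketch below is only an illustration of the template: it writes both C1 and C2 as C, leaves out the tone T (every syllable carries one), and uses the hypothetical names `syllable_shapes` and `fits_template`.

```python
import re
from itertools import product


def syllable_shapes():
    """Enumerate every segmental shape licensed by (C1)(w)V(G|C2)."""
    onsets = ["", "C"]        # (C1): optional initial consonant
    glides = ["", "w"]        # (w): optional bilabial on-glide
    codas = ["", "G", "C"]    # (G|C2): nothing, an off-glide, or a final consonant
    return [c1 + w + "V" + coda for c1, w, coda in product(onsets, glides, codas)]


def fits_template(shape):
    """Check a shape string such as 'CwVC' against the same template."""
    return re.fullmatch(r"C?w?V[GC]?", shape) is not None
```

The two onset options, two glide options and three coda options give 2 × 2 × 3 = 12 shapes, from the bare nucleus V up to CwVG and CwVC.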

More explicitly, the syllable types are as follows:


 * {| class=wikitable

! Syllable ! Example ! Syllable ! Example
 * - style="font-size: 85%; background: #f2f2f2;"
 * V
 * wV
 * VC
 * wVC
 * VC
 * wVC
 * CV
 * CwV
 * CVC
 * CwVC
 * CVC
 * CwVC
 * }

C1:

Any consonant may occur as an onset with the following exceptions:
 * does not occur in native Vietnamese words
 * does not occur in Hanoian, but it does occur in Saigonese and other varieties (due to sound change)

w:


 * does not occur after labial consonants
 * does not occur after in native Vietnamese words (it occurs in uncommon Sino-Vietnamese borrowings)
 * the sequences appears in Saigonese as, excepting spelling pronunciations

V:

The vowel nucleus V may be any of the following 14 monophthongs or diphthongs:.

G: The offglide may be or. Together, V and G must form one of the diphthongs or triphthongs listed in the section on Vowels. The offglide cannot be if the syllable contains a  onglide, except in the case of 'khuỷu (tay)' (elbow).

C2:

The optional coda C2 is restricted to labial, coronal, and velar stops and nasals.

T:

Syllables are spoken with an inherent tone contour. All tone contours are possible for open syllables (syllables without consonant codas) and closed syllables with nasal codas. If the syllable is closed with a labial, coronal, or velar stop /p, t, k/, only two contours are possible: the sắc and the nặng tones.
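
The tone restriction can be stated as a lookup from the coda to the set of permitted tones. The Python sketch below is a minimal illustration that assumes the stop codas are spelled p, t, c and ch; the names `allowed_tones`, `STOP_CODAS` and the other constants are hypothetical.

```python
ALL_TONES = frozenset({"ngang", "huyền", "sắc", "nặng", "hỏi", "ngã"})
STOP_CODA_TONES = frozenset({"sắc", "nặng"})
STOP_CODAS = frozenset({"p", "t", "c", "ch"})  # orthographic spellings (assumed)


def allowed_tones(coda):
    """Return the tones available given a spelled coda ('' for an open syllable)."""
    # Stop-final syllables allow two tones; open and nasal-final syllables allow six.
    return STOP_CODA_TONES if coda in STOP_CODAS else ALL_TONES
```

For instance, `allowed_tones("c")` contains only sắc and nặng, while `allowed_tones("ng")` and `allowed_tones("")` contain all six tones.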