Kashu-do (歌手道): STRUCTURE OF THE VOCAL FOLDS: A three-dimensional view

Fig. 1. This two-dimensional view shows the “depth” of the vocal folds on the Y-axis (vertical) and the “width” on the X-axis (horizontal). We are not seeing the “length” of the folds, which would be parallel to the ground.
I chose to begin with this view so that we are aware of “what we are after” as singers. The curlicue blue arrow shows the airway (the path of the breath stream).
Our first issue is the interaction of the breath stream with the vocal folds. The epithelium and the superficial lamina propria (Reinke’s Space) together are referred to as the “Fold Cover”; the other layers, which get progressively stiffer (less flexible), are called the “Body”. We would like the vibration of the vocal folds during singing to be isolated on the fold cover (yellow and blue). Isolating the fold cover gives the voice a sensation of flexibility: the type of sensation we identify as heady, fluid and tension-free! I call this the Flag and Flagpole Effect. For a flag to flutter freely in the wind, it needs the structure of the firm flagpole to steady it.
Fig. 2A. This animated gif simulates the Flag and Flagpole Effect. The body of the fold (the muscular red portion and the medial yellow portion) is still, while the outside layer, the cover, oscillates with the movement of the breath (not seen).
Fig. 2B. This second animation shows a flexible body that allows the entire mass of the folds to participate in the vibration. If the entire mass of the folds is active in the vibration, the sensation is one of tension and inflexibility. Greater breath pressure is needed to maintain the vibration. It is far greater work and it tires the voice. The question remains: how do we get the body of the folds to be firm enough that the breath stream activates the cover only?
Dr. Zhaoyang Zhang at UCLA shows, in a 2008 article, that a stiffer fold body will isolate the vibration of the vocal folds along the mucosal edge. These two images from his article illustrate the different modes of vibration:
Fig. 3A. This first picture represents a model of a loose fold body and cover. When there is not enough antagonism between the Thyro-arytenoid group and the Crico-thyroid, the body (represented by the leftmost blue structure) vibrates with the cover (rightmost blue structure). The red structures represent the same two structures at rest, in order to show relative movement. The vibration will tend to be more difficult in such a case. More sub-glottal pressure will be necessary to start and maintain the vibration. (This animated gif and the following one represent the left fold.)

Fig. 3B. This simulation, on the other hand, represents a stiff fold body (the leftmost blue structure is still) rendered by contractions of both main muscle groups. This antagonism makes the fold body less mobile and isolates the vibration of the folds along the mucosal edge (the cover, represented by the mobile rightmost blue structure). The antagonism between the Thyro-arytenoid and the Crico-thyroid also increases the contact area along the mucosal edge.

Fig. 4. This is a similar two-dimensional view of the folds but more anatomically complete. On the right side of the picture, I draw a red line from the top of the epiglottis around the vestibular fold (false vocal fold), around the true fold and down to the trachea. This layer of tissue is one fold that covers the entire structure. That is why we refer to the vocal cords as folds. That tissue, when it comes to the true vocal folds, forms the outer layer (epithelium). That layer covers the rest of the components of the entire vocal fold structure.

Fig. 4B. Let us concentrate our attention on the two lateral muscles on each side, the Thyroarytenoid (also called External or Thyromuscularis) and the Vocalis muscle (sometimes called Internal or Thyrovocalis)! When these two muscles contract, they contract in opposite directions. When it contracts, the Vocalis thickens the vocal folds vertically (it gives them more depth; see the first pictures for perspective) and so creates greater contact area (more on contact area later). The Muscularis contracts the folds in the opposite direction, also helping create greater vertical mass. (The Muscularis also contracts slightly inward, which appears to have a secondary closure function.) When the two muscles are active together in opposition to the Cricothyroid (see below), they create a dynamic that renders them stiffer and less flexible. When the muscles are stiff (and by proximity, the medial layers as well), the outer layers (the cover) alone respond to the movement of the breath stream. This allows the tone to feel more fluid, less resistant to the airstream: what we identify as the head voice sensation. The action of the Muscularis is at least equivocal. Although its action shortens/thickens the vocal folds, the fact that the CT is pulling on the Thyroid cartilage on the same vector as the Muscularis makes the exact nature of its contraction difficult to gauge. Even though the Muscularis is a thickening partner to the Vocalis, most articles and books give the task of vocal fold thickening to the Vocalis. More interesting is the slightly inward angle of the contraction, which contributes as a secondary adductor.
Given an appropriate dynamic between the two intrinsic muscles of the vocal folds, pitch is pretty much controlled by the contraction or relaxation of the Cricothyroid muscles. They are included in the above picture but not labeled.
Fig. 4C. This picture, very similar to the one above it, has pointers to the cricothyroid muscle.  Let us have an outside view:
Fig. 4D. Looking from the outside, we can see that when the cricothyroid contracts (toward its point of origin, the cricoid cartilage), it would pull the thyroid cartilage downward. Looking at the picture directly above (inside view), it is clear that the vocal folds are attached to the Thyroid cartilage on the inside. When the thyroid cartilage is tilted downward, the vocal folds stretch and are set up for faster oscillation (vibration) and therefore higher frequency (pitch). We will discuss the mechanics of vocal fold oscillation later, and how “shallower” folds (less deep) make for faster oscillations and higher pitch. For now let us consider the dynamic antagonism between the three muscle pairs (one set for each fold): Thyro-muscularis, Thyro-vocalis and Crico-thyroid.

The Thyro-arytenoid (TA) pairs (vocalis and muscularis) must relax to allow the stretching/thinning of the vocal folds that makes for faster oscillations and higher frequencies. However, the lengthening of the folds helps maintain the stiffness of the fold body as long as the TA group continues to be active and does not give in completely to the contraction of the Crico-thyroid (CT). Ideally, the thickness (depth) of the folds changes with pitch, but the stiffness of the body is maintained as long as there is enough antagonism (opposing action) between the three muscle groups.
The question in the singer’s mind will always be:
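The link between stretch and pitch described above can be loosely illustrated with the ideal-string formula, f0 = (1 / 2L) * sqrt(stress / density). This is only a rough sketch and not a model of the layered fold: it ignores the cover/body distinction entirely. The function name, the density value and the example numbers are assumptions chosen for illustration, not measurements from this article.

```python
import math

# Assumed, illustrative soft-tissue density in kg/m^3 (not from the article).
TISSUE_DENSITY = 1040.0


def string_f0(length_m, stress_pa, density=TISSUE_DENSITY):
    """Fundamental frequency (Hz) of an ideal string.

    f0 = (1 / (2 * L)) * sqrt(stress / density)
    length_m  : vibrating length in metres
    stress_pa : longitudinal stress (tension per unit area) in pascals
    """
    if length_m <= 0 or density <= 0 or stress_pa < 0:
        raise ValueError("length and density must be positive, stress non-negative")
    return math.sqrt(stress_pa / density) / (2.0 * length_m)


if __name__ == "__main__":
    # Hypothetical values: a 16 mm vibrating length at two stress levels.
    for stress in (10_000.0, 40_000.0):
        print(f"stress={stress:>8.0f} Pa -> f0={string_f0(0.016, stress):.1f} Hz")
```

In this simplified picture, quadrupling the longitudinal stress doubles the frequency, while lengthening alone (at constant stress) would lower it. The stretch produced by the CT raises pitch because the rise in tension outweighs the added length.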
The singer’s experience is basically sensory. Indeed we can only activate these muscles by having experienced certain sensations associated with their function. Two basic sensations are Stretch and Substance. When I sing a relaxed high note softly (a sensation akin to falsetto or flute-voice), I have a feeling that I can keep going up without problem, as if I can continually stretch. In fact there is a sensation of lengthening. A sensation of substance, meatiness, full-bodiedness is experienced when I sing a relaxed low note. In both situations, the experience is one-sided. If I increase pressure, the stretch-dominant note tends to go toward falsetto (an over-stretch that cannot endure the increased breath pressure, causing the back of the folds to open; more on fold closure later). Increasing volume on the substance-dominant note is easier, because the vertical fold mass is enough to endure the increased pressure. However, it becomes progressively more difficult to rise in frequency because the set-up is one-sided in favor of the Vocalis muscle that governs thickness.
In order to effectuate a gradual crescendo without loss of balanced coordination, both sensations must be engaged before increasing the tone thus (I chose to begin on Db4, right on the muscular balancing point of the tenor voice):

The next two videos show a soprano dealing with balance on both sides of the issue: the tendency to lose stretch on the way down and to lose substance on the way up. (Of special note is that the soprano has an excellent F2 dominance in her middle range. This will come up when we discuss resonance.)

The state of the tone before the crescendo (we will soon bring breath pressure and fold closure into consideration) is interesting. Is it falsetto or a soft head-tone? By our definition above, head-voice is essentially proper muscular coordination, including an appropriate balance between substance and stretch. If the TA group is appropriately balanced throughout the changes in CT contraction and relaxation (pitch), the vibration will be isolated on the cover and the tone will feel heady and released. However, we take for granted that fold closure and breath pressure play appropriate roles.


What happens if the two folds close too hard against each other? The fold cover would be pressed against the body (muscular layer) and would not be free to oscillate.  In such a case the vibration would have to include the entire fold structure (including the body).  The amount of breath pressure would have to be very high to maintain the vibration of greater horizontal mass, including a relatively still body.  This is why pressed phonation does not work and why it is not a remedy for breathy phonation.  Breathy phonation often occurs when the vertical mass (induced by the contraction of the Vocalis) is inadequate or the muscles responsible for bringing the folds to midline are not working adequately.
This is where I theorize (only because this has not been observed with scientific protocol yet): I believe that when the vertical phase is too shallow, it takes the shape of a higher frequency (pitch) than is desired. Therefore, the tendency is to sing sharp (higher frequency). To compensate, the folds press together to slow down the opening phase of the vibration. In this way the intended frequency is achieved; however, the tone is pressed and the vibration includes the body. A singer who does not like the tension that comes with this pressed mode of singing might reduce the breath pressure by allowing the arytenoidal juncture (the back of the folds) to open, creating a gap that allows the air to pass through. There are many who use this pressed mode of phonation with leakage through the arytenoidal gap without knowing they are doing it. It can be done subtly or not so subtly. When it is a minor compensation it does not sound bad, and it can be difficult to convince a successful singer to change. I have observed this strategy in many high voices, particularly coloraturas and Rossini tenors. This approach is also common among singers of early music.
While we are on the subject of fold depth and closure, it is worthwhile to mention that a recent paper by Harry Hollien (University of Florida), “Vocal Fold Dynamics for Frequency Change” (Journal of Voice, July 2014), confirmed that fold mass is basically the same for a given fundamental frequency (pitch) regardless of who the singer is. This means that fold thickness for a coloratura or a bass is basically the same when they are singing the same pitch. What is different is the relative longitudinal tension (tautness) of the vocal folds on the given pitch. A coloratura singing C4 (middle C, or C1 in the European system) is in her lower range and thus has relatively relaxed folds, whereas a bass singing the same note is in his upper range and has much more longitudinal tension:

What was not expected was the relatively high correlation between vocal fold thickness and absolute fundamental frequency of phonation…As can be seen, the thickness of the folds appears to be reasonably similar at each fundamental frequency no matter if the subject was male or female or had a high-pitched or low-pitched voice.  Thus, it appears that the per-unit mass of the folds relates to the frequency produced no matter how massive (or not) these structures are naturally. (Hollien p. 400)

Hollien later explains the correlation between thickness of the folds, variations in length and overall mass.  Although the bass folds must lengthen considerably to achieve C4, in the end, the vibrating mass is the same as with the coloratura who does not have to lengthen very much to sing the same pitch.  This very complex experiment at least tells us there is an optimal fold depth and length index for a given pitch produced by a given voice.  If that depth/length relationship (Stretch and Substance) is not achieved, there must be compensatory measures (usually pressing and raised breath pressure).
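To put rough numbers on the idea that a longer fold needs more longitudinal tension than a shorter fold to reach the same pitch, here is a minimal sketch that inverts the ideal-string formula, stress = density * (2 * L * f0)^2. It is not Hollien's method and says nothing about fold thickness. The function names, the density and the two fold lengths are illustrative assumptions, and equal temperament with A4 = 440 Hz is assumed for the pitch.

```python
# Semitone offsets within an octave (flats spelled as in the article, e.g. Db).
NOTE_OFFSETS = {"C": 0, "Db": 1, "D": 2, "Eb": 3, "E": 4, "F": 5,
                "Gb": 6, "G": 7, "Ab": 8, "A": 9, "Bb": 10, "B": 11}


def note_frequency(name, octave, a4_hz=440.0):
    """Equal-tempered frequency in Hz, assuming A4 = 440 Hz (C4 is middle C)."""
    midi = 12 * (octave + 1) + NOTE_OFFSETS[name]
    return a4_hz * 2.0 ** ((midi - 69) / 12.0)


def required_stress(length_m, f0_hz, density=1040.0):
    """Stress (Pa) an ideal string of this length needs to sound f0.

    The density default is an assumed, illustrative soft-tissue value.
    """
    return density * (2.0 * length_m * f0_hz) ** 2


if __name__ == "__main__":
    c4 = note_frequency("C", 4)
    # Hypothetical vibrating lengths, for illustration only.
    for label, length in (("shorter fold", 0.010), ("longer fold", 0.016)):
        print(f"{label}: {required_stress(length, c4):.0f} Pa at {c4:.1f} Hz")
```

Under these assumptions the longer fold needs about 2.56 times the stress of the shorter one (the square of the length ratio 1.6) to sound C4, which is in line with the bass feeling much more longitudinal tension than the coloratura on the same note.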

At this juncture, we can pedagogically conclude that during phonation of a given pitch, there must be a balance between fold thickness (depth) and lengthening that adheres to a gentle closure of the folds such that the fold cover is not trapped.   The next area of concern is therefore how the muscles that govern fold closure (Lateral Crico-Arytenoids and Inter-arytenoids) respond to increased breath pressure.

Fig. 5A. The right Posterior Cricoarytenoid (PCA) muscle is removed in this picture to feature a clear view of the Lateral Cricoarytenoid (LCA).

Fig. 5B. The rendering above takes all obstructive tissue away so we can see how the LCA attaches to the muscular process of the arytenoid. The black dot represents a swivel point. When the muscle contracts in the direction of the Cricoid (unseen here), the arytenoid swivels, bringing the vocal processes (where the vocal folds insert) inward and closing the glottis. (Muscles contract in the direction of the point of origin. They are named by point of origin and then point of insertion. The CA is so named because it originates at the Cricoid and inserts into the Arytenoid, thus Crico-Arytenoid.) It should be noticed that the swiveling of the arytenoids inward also creates a gap in the back. The arytenoids also have the ability to rock inward where the gap is. This is controlled by the Inter-Arytenoids (IA).
Fig. 6A. The picture above shows both sets of Interarytenoids (IA): transverse and oblique.  The transverse go across parallel between the arytenoids.  When they contract they bring the arytenoids closer together and close the gap.  The obliques do the same but draw the arytenoids in diagonally. Both actions are necessary to completely close the posterior gap.

Fig. 6B. This picture gives a clearer view of the arytenoids and shows more clearly the layers of muscles.

Are these muscles strong enough to maintain gentle closure even when breath pressure increases for volume? In other words, loudness has the potential of disrupting balance if one of the muscles cannot maintain its proper function when pressure is applied. There must be a means of strengthening these muscles in balance (we shall discuss the logic behind occlusives later).

Just to be thorough, I must mention the Posterior Crico-arytenoid (PCA). It is responsible for abducting (drawing apart) the folds. Muscular activity has been observed in the PCA during phonation, which would be unexpected. I have a couple of theories on that. Since all muscles are paired, it is possible that when the adductors (Lateral CA) are dominant (as expected during phonation), the abductors (Posterior CA) provide counterbalance. It is also possible, given the vector of its contraction, that the PCA counters the vector of the Crico-thyroid, which stretches the folds for pitch.

Finally I must address the secondary adductive function of the Thyromuscularis (external TA). I mentioned above that this muscle contracts slightly inward, and since its vector is more or less the same as the CT, when the folds are elongated they tend to come together a little more. This secondary adduction must be taken into account. Sometimes inefficiency occurs not because the IAs or the LCAs are functioning inadequately but rather because the folds are not lengthened enough for the desired pitch. There can be many variations on how a given fundamental frequency is obtained. It is theoretically possible for the vibratory cycle to occur without the top of the folds closing. This mode of vibration would be possible for folds that are too deep (TA-hyperfunction). This is the second sound I demonstrated on the first clip.

I will stop here for now.  We will continue soon with breath, resonance, etc…

© 07/08/2015