The Singer's Formant: What It Is and How to See It in Your Voice

June 14, 2026 · 14 min read · Jorge C. Lucero

🎯 Key Takeaways

The singer's formant is a peak of acoustic energy near 3 kHz—the physical basis of the "ring" (squillo) that lets a trained voice carry over an orchestra without amplification
It is a resonance phenomenon, not a louder voice source—formed when the third, fourth, and fifth formants (F3, F4, F5) cluster into a single reinforced band (Sundberg, 1974)
It works because of a gap in the orchestra's spectrum—orchestral energy peaks near 500 Hz and rolls off steeply, leaving the ~3 kHz region relatively free for the voice to occupy
Center frequency varies by voice type—roughly 2.4 kHz for basses up to ~3.0 kHz for higher voices; high sopranos rely on a different strategy (formant tuning) rather than clustering
It can be quantified—the Singing Power Ratio (SPR), read from a long-term average spectrum, is the most common objective index of singer's-formant prominence

A trained opera singer can be heard clearly over an eighty-piece orchestra, with no microphone, from the back row of a large hall. The orchestra produces vastly more acoustic power than a single human voice—so how does the voice get through? The answer is not that the singer is simply louder. It is that the trained voice concentrates energy in a narrow frequency region where the orchestra is weak. That region, and the spectral peak that sits in it, is called the singer's formant.

For voice teachers, the singer's formant is the measurable counterpart of qualities they describe in perceptual language: ring, squillo, presence, ping, projection. For voice scientists, it is one of the best-studied acoustic markers of trained singing. This guide explains what it is, why it works, how it differs across voice types, and how it can be observed and measured. The information here draws on published voice-science research.

One framing note before we begin. This is an article about acoustics—about what the signal shows and how to read it. It is not a method for producing the singer's formant, and it makes no claim about how any individual should train. Measurement and pedagogy are separate disciplines; this guide stays on the measurement side.

What Is the Singer's Formant?

Recall that formants are the resonant frequencies of the vocal tract—the bands of frequency that the throat and mouth selectively amplify. The first two formants (F1 and F2) determine which vowel we hear. The higher formants (F3, F4, F5) sit above the vowel-defining region and contribute to timbre and speaker identity.

In ordinary speech, F3, F4, and F5 are spread out and each is relatively modest in amplitude. In trained classical singing, something different happens: these three higher formants move close together in frequency and reinforce one another, producing a single prominent peak of acoustic energy near 3 kHz. Johan Sundberg, whose work at KTH in Stockholm established the modern understanding of this effect, named it the singer's formant and—reflecting that it is really several formants bunched together—later described it as the singer's formant cluster.

Three Ideas That Define the Singer's Formant

1. A peak near 3 kHz

The singer's formant appears as a single broad prominence in the spectrum, typically centered somewhere between about 2.4 and 3.4 kHz depending on the voice. Its center frequency is largely stable across vowels and across a singer's range.

2. It is resonance, not effort

The peak is created by the shape of the vocal tract—a filter effect—not by the vocal folds working harder. The same laryngeal sound source, shaped by a different vocal-tract configuration, yields much more energy in the 3 kHz region.

3. It is a cluster of F3, F4, and F5

The prominence is not a new, separate resonance but the result of the third, fourth, and fifth formants converging and reinforcing each other into one broad band rather than three smaller, separate peaks.
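The reinforcement that comes from clustering can be illustrated with a textbook all-pole (cascade) formant model. The sketch below is an illustration under stated assumptions, not a model of any real singer: the formant frequencies, the uniform 100 Hz bandwidth, and the function names are all hypothetical choices made for the example. It compares the strongest level in the 2–4 kHz band when F3, F4, and F5 are spread out against the level when they sit close together.

```python
import math


def cascade_level_db(freq_hz, formants):
    """Level in dB of an all-pole formant cascade at one frequency.

    formants: list of (center_hz, bandwidth_hz) pairs.
    """
    s = 2j * math.pi * freq_hz
    h = 1.0 + 0j
    for center_hz, bandwidth_hz in formants:
        pole = complex(-math.pi * bandwidth_hz, 2 * math.pi * center_hz)
        # Each conjugate pole pair is normalised to unity gain at 0 Hz.
        h *= (pole * pole.conjugate()) / ((s - pole) * (s - pole.conjugate()))
    return 20 * math.log10(abs(h))


def peak_level_db(formants, lo_hz=2000, hi_hz=4000, step_hz=10):
    """Strongest level of the cascade inside [lo_hz, hi_hz]."""
    return max(cascade_level_db(f, formants)
               for f in range(lo_hz, hi_hz + 1, step_hz))


BANDWIDTH_HZ = 100                      # assumed, identical for every formant
VOWEL_FORMANTS = [(500, BANDWIDTH_HZ), (1500, BANDWIDTH_HZ)]  # hypothetical F1, F2

# Hypothetical F3, F4, F5 placements: spread out vs. bunched together.
spread = VOWEL_FORMANTS + [(2500, BANDWIDTH_HZ), (3500, BANDWIDTH_HZ), (4500, BANDWIDTH_HZ)]
clustered = VOWEL_FORMANTS + [(2600, BANDWIDTH_HZ), (2900, BANDWIDTH_HZ), (3200, BANDWIDTH_HZ)]

spread_peak = peak_level_db(spread)
clustered_peak = peak_level_db(clustered)
print(f"spread:    {spread_peak:.1f} dB")
print(f"clustered: {clustered_peak:.1f} dB")
print(f"gain from clustering: {clustered_peak - spread_peak:.1f} dB")
```

The source spectrum is left out on purpose, so any difference between the two cases comes from the filter alone. With these particular numbers the clustered case comes out several decibels stronger, which is the sense in which neighboring formants "reinforce" one another.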

"Ring" and "squillo" are perceptual; the singer's formant is acoustic

Voice teachers have long used words like ring, squillo, ping, and presence to describe a bright, carrying quality. The singer's formant is the acoustic correlate of those perceptions—the thing in the signal that researchers have repeatedly linked to listener ratings of "ringing" quality. Perceptual terms describe the experience; the singer's formant describes the measurement.

Why It Lets a Voice Carry Over an Orchestra

The power of the singer's formant comes from where it sits, not just how strong it is. The combined sound of an orchestra has most of its energy at low frequencies—it peaks near 500 Hz and then falls off steeply toward higher frequencies. A voice trying to compete down at 500 Hz is fighting the orchestra at its strongest. But around 3 kHz, the orchestra has comparatively little energy left.

By concentrating energy in that ~3 kHz window, the trained voice places its sound precisely in the gap where the orchestra is quiet. The ear, which is also highly sensitive in this frequency range, picks the voice out easily. This is why a singer with a well-developed singer's formant can be heard over a large ensemble without ever being "louder" than it in total acoustic power.

A useful analogy

Imagine a crowded room where everyone is talking in low, murmuring tones. A whistle cuts through instantly—not because it is louder than the whole room, but because its energy sits in a frequency band the murmur does not occupy. The singer's formant works on the same principle: it finds the open lane in the spectrum.

The Articulatory Basis

What vocal-tract configuration produces the clustering? Sundberg's (1974) classic acoustic model proposed that the key is the relationship between the larynx tube (the small space just above the vocal folds, bounded by the epiglottis and arytenoid structures) and the pharynx above it. When the pharynx is wide relative to the outlet of the larynx tube—Sundberg estimated a ratio of roughly 6 to 1—the larynx tube begins to act as a separate little resonator, adding a resonance near 2.8 kHz and pulling the higher formants together.
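For a rough sense of scale, the idea of a small tube acting as its own resonator can be checked with the standard quarter-wavelength formula for a uniform tube closed at one end and open at the other, f = c / (4L). This is a deliberate idealization and not Sundberg's model: the uniform-tube shape, the speed of sound of 350 m/s, and the function names are assumptions made for the sketch, and a real larynx tube is neither uniform nor ideally terminated.

```python
SPEED_OF_SOUND_M_S = 350.0  # assumed value for warm, humid air in the vocal tract


def quarter_wave_resonance_hz(length_m, c=SPEED_OF_SOUND_M_S):
    """First resonance of an idealized uniform tube, closed at one end."""
    return c / (4.0 * length_m)


def length_for_resonance_m(freq_hz, c=SPEED_OF_SOUND_M_S):
    """Tube length whose first quarter-wave resonance equals freq_hz."""
    return c / (4.0 * freq_hz)


# Length that this idealization would need for a resonance near 2.8 kHz.
length_m = length_for_resonance_m(2800.0)
print(f"idealized tube length: {length_m * 1000:.2f} mm")
print(f"check: {quarter_wave_resonance_hz(length_m):.0f} Hz")
```

Under these assumptions the required length is on the order of a few centimeters, which at least shows why a short cavity just above the vocal folds is a plausible home for a resonance in this frequency region.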

Later imaging research has refined this picture. Studies using MRI of trained singers have not always confirmed the exact 6:1 geometric ratio, and some point to expansion of the lower pharynx as the more consistent anatomical correlate of the trained configuration. The broad principle—that a particular shaping of the lower vocal tract clusters the upper formants—remains well supported, even as the precise anatomy continues to be studied.

Measurement does not require knowing the anatomy

The ongoing refinement of the articulatory model is a research question. Observing the singer's formant in a recording is not—the spectral peak is visible regardless of which anatomical account best explains it. This guide focuses on what can be seen and measured in the acoustic signal.

Center Frequency Varies by Voice Type

The singer's formant does not sit at exactly the same frequency for everyone. Lower voices, with longer vocal tracts, cluster their higher formants at somewhat lower frequencies; higher voice types cluster them a little higher. The following approximate center frequencies are widely cited from Sundberg's work, as summarized in the voice-science literature:

Voice type	Approx. center frequency
Bass	~2,400 Hz
Baritone	~2,550 Hz
Tenor	~2,840 Hz
Alto / contralto	~3,000 Hz

Approximate values, after Sundberg (2001) as summarized in Johnson & Kempster (2011). Individual singers vary; these are central tendencies, not targets.

The Soprano Exception

High sopranos are an important special case. At high pitches, the spacing between harmonics becomes very wide, and research (e.g., Weiss, Brown & Morris, 2001) indicates that high soprano voices do not show the same narrow, clustered singer's formant typical of tenors. Instead, high female voices tend to rely on formant tuning—adjusting a lower resonance (often the first formant) to line up with a strong harmonic of the sung pitch—to gain output and projection. In other words, "carrying power" is achieved by more than one acoustic strategy, and the classic singer's formant is mainly a feature of lower and male voice types.
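The effect of wide harmonic spacing is easy to check with arithmetic. The sketch below counts how many harmonics of a sung pitch fall inside the 2–4 kHz band; the two pitches (A2 at 110 Hz and roughly C6 at 1046.5 Hz) and the function name are chosen only for illustration.

```python
import math


def harmonics_in_band(f0_hz, lo_hz=2000.0, hi_hz=4000.0):
    """Return the harmonic frequencies of f0_hz that fall inside [lo_hz, hi_hz]."""
    first = math.ceil(lo_hz / f0_hz)
    last = math.floor(hi_hz / f0_hz)
    return [n * f0_hz for n in range(first, last + 1)]


low_voice = harmonics_in_band(110.0)     # A2, a low male pitch
high_voice = harmonics_in_band(1046.5)   # approximately C6, a high soprano pitch

print(len(low_voice), "harmonics in 2-4 kHz at 110 Hz")
print(len(high_voice), "harmonics in 2-4 kHz at 1046.5 Hz:", high_voice)
```

At the low pitch the band is densely sampled by harmonics, so a resonance peak anywhere in it will be excited. At the high pitch only a couple of harmonics land in the band, which is one reason aligning a resonance with a specific harmonic becomes the more effective strategy.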

How the Singer's Formant Is Measured

Because the singer's formant is a feature of the overall spectral balance of a voice, the standard way to observe it is the long-term average spectrum (LTAS)—a plot of average energy against frequency, computed over a stretch of phonation rather than a single instant. On an LTAS of a trained voice with a strong singer's formant, a distinct hump appears in the 2–4 kHz region. On an untrained voice, or in ordinary speech, that region is comparatively flat.
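As a minimal sketch of what an LTAS computes, the code below averages windowed power spectra over overlapping frames. It uses only the Python standard library and a naive DFT so that it runs anywhere; a real workflow would use an FFT library and much longer recordings. The frame length, hop size, function names, and the synthetic two-tone test signal are all assumptions made for the example, not a description of any particular tool.

```python
import cmath
import math


def hann(n):
    """Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def power_spectrum(frame):
    """Naive DFT power spectrum for bins 0..N/2 (adequate for short frames)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(frame))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def ltas_db(samples, sample_rate, frame_len=256, hop=128):
    """Average the power spectra of overlapping frames; return (freqs, dB levels)."""
    window = hann(frame_len)
    total = [0.0] * (frame_len // 2 + 1)
    count = 0
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = [s * w for s, w in zip(samples[start:start + frame_len], window)]
        for k, p in enumerate(power_spectrum(frame)):
            total[k] += p
        count += 1
    if count == 0:
        raise ValueError("signal is shorter than one frame")
    freqs = [k * sample_rate / frame_len for k in range(len(total))]
    # Small floor avoids log10(0) in empty bins; levels are relative, not calibrated.
    levels = [10 * math.log10(t / count + 1e-12) for t in total]
    return freqs, levels


# Synthetic stand-in for a voice: a strong 500 Hz tone plus a weaker 3 kHz tone.
SAMPLE_RATE = 16000
signal = [math.sin(2 * math.pi * 500 * n / SAMPLE_RATE)
          + 0.3 * math.sin(2 * math.pi * 3000 * n / SAMPLE_RATE)
          for n in range(2048)]

freqs, levels = ltas_db(signal, SAMPLE_RATE)
high_level, high_freq = max((lvl, f) for f, lvl in zip(freqs, levels) if 2000 <= f <= 4000)
low_level, low_freq = max((lvl, f) for f, lvl in zip(freqs, levels) if f < 2000)
print(f"low-band peak:  {low_freq:.0f} Hz")
print(f"high-band peak: {high_freq:.0f} Hz, {high_level - low_level:.1f} dB relative to low band")
```

Because the averaging discards timing and keeps only the overall spectral balance, the hump in the 2–4 kHz region shows up regardless of when in the excerpt the energy occurred.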

The Singing Power Ratio (SPR)

The most widely used numerical index of singer's-formant prominence is the Singing Power Ratio (SPR), introduced by Omori and colleagues (1996). Conceptually it is simple:

SPR (dB) = (level of the strongest peak in the 2–4 kHz band) − (level of the strongest peak in the 0–2 kHz band)

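It compares the strongest spectral peak in the singer's-formant region against the strongest peak in the lower region. Both peaks are read in decibels, so subtracting them is equivalent to taking a ratio of the underlying power values, and the result is usually negative because the low band tends to hold the stronger peak. A higher (less negative) SPR means relatively more energy in the singer's-formant band, that is, more "ring." Research has shown that SPR distinguishes trained singers from untrained voices and correlates with listener ratings of ringing quality, which is why it is often proposed as an objective way to monitor change over time.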
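The SPR definition translates directly into a few lines of code once a spectrum is available. The sketch below assumes the spectrum is given as parallel lists of frequencies and decibel levels; the function name, the choice to assign exactly 2,000 Hz to the upper band, and the two sets of spectrum values are hypothetical and made up purely for illustration.

```python
def singing_power_ratio_db(freqs_hz, levels_db, split_hz=2000.0, top_hz=4000.0):
    """Strongest level in [split_hz, top_hz] minus strongest level in [0, split_hz)."""
    low = [lvl for f, lvl in zip(freqs_hz, levels_db) if 0.0 <= f < split_hz]
    high = [lvl for f, lvl in zip(freqs_hz, levels_db) if split_hz <= f <= top_hz]
    if not low or not high:
        raise ValueError("spectrum must cover both the 0-2 kHz and 2-4 kHz bands")
    return max(high) - max(low)


# Hypothetical LTAS readings (dB, relative) for two productions of the same vowel.
freqs = [250, 500, 1000, 1500, 2500, 3000, 3500, 5000]
dull = [-20, -10, -18, -25, -40, -38, -42, -50]
ringing = [-20, -10, -18, -25, -28, -22, -27, -50]

spr_dull = singing_power_ratio_db(freqs, dull)
spr_ringing = singing_power_ratio_db(freqs, ringing)
print(f"dull:    {spr_dull} dB")
print(f"ringing: {spr_ringing} dB")
```

Both values are negative, and the "ringing" example is the less negative of the two, matching the reading of the index described above. Energy above 4 kHz (the 5,000 Hz entry here) is ignored by design.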

A closely related family of measures—such as the energy ratio and short-term energy ratio—works on the same idea of comparing energy above and below roughly 2 kHz. The details differ, but all of them ask the same underlying question: how much of this voice's energy is concentrated in the high band where ring lives?

Measurement caveats worth knowing

Recording quality matters most. The singer's formant lives in the 2–4 kHz region, which is sensitive to microphone response, room acoustics, distance, and compression. A consistent recording setup is essential before comparing values across sessions.
SPR is relative, not absolute. Because it is a ratio of peaks, it is best used to compare the same voice under consistent conditions—not to rank different singers recorded in different settings.
Vowel and loudness affect the value. The prominence of the singer's formant depends on the vowel and on vocal loudness, so comparisons should hold these as constant as possible (a sustained /a/ at a comparable level is a common choice).
It is not a quality score. A particular SPR value is not "good" or "bad" in the abstract. It describes a spectral feature; its meaning depends entirely on the singer, the genre, and the goal.

Seeing It in Your Own Voice

The singer's formant is one of the most rewarding acoustic features to look at directly, because the difference between a "ringing" production and a dull one is often plainly visible in the spectrum. Two views are particularly useful:

On a spectrogram

A sustained vowel with a strong singer's formant shows a band of darkened (high-energy) harmonics in the region around 2.5–3.5 kHz. Comparing two productions of the same vowel—one bright, one "held back"—the high-frequency band visibly intensifies in the brighter one.

On a spectrum / LTAS view

Averaging over a sustained note, a hump in the 2–4 kHz region is the signature of the singer's formant. The height of that hump relative to the low-frequency energy is, in essence, what the SPR quantifies.

For teachers, the value of looking at this directly is that it turns an abstract perceptual goal into something visible and comparable across time. A recording from the start of a term and one from the end, captured under the same conditions, make change in spectral balance concrete in a way that words alone cannot.

Common Questions

Q: Is the singer's formant only for opera and classical singing?

It is most strongly associated with Western classical and operatic singing, where unamplified projection over an orchestra is the demand it evolved to meet. A comparable high-frequency energy concentration, sometimes called the "speaker's ring," has also been studied in trained speaking voices, and related strategies appear in other styles. With modern amplification, the acoustic need is different in popular genres, but the phenomenon itself is general to how the vocal tract can shape high-frequency energy.

Q: Is a higher SPR always better?

No. SPR describes how much energy sits in the singer's-formant band relative to the lower band. Whether more is desirable depends entirely on the voice, the repertoire, and the artistic goal. SPR is a descriptive measurement, not a verdict on quality.

Q: Why don't I see a clear singer's formant in a high soprano recording?

That is expected. Research indicates that high female voices generally do not produce the narrow, clustered singer's formant typical of tenors; they tend to use formant tuning instead. The absence of a classic cluster peak in a high soprano is consistent with the literature, not a sign of a problem.

Q: Can I measure this from a phone recording?

With care. The singer's formant sits in a frequency region affected by microphone characteristics and room acoustics, so the absolute numbers should be treated cautiously. If the recording setup is kept consistent—same device, distance, and room—comparisons of the same voice across sessions are more meaningful than comparisons across different setups.

Summary

1. The singer's formant is a spectral peak near 3 kHz: the acoustic basis of vocal "ring" and unamplified projection
2. It is a resonance effect: F3, F4, and F5 cluster into one reinforced band rather than the voice source becoming louder
3. It exploits a gap in the orchestra's spectrum: energy near 3 kHz sits where competing instruments are weak
4. Center frequency varies by voice type, and high sopranos use formant tuning instead of the classic cluster
5. The Singing Power Ratio (SPR), read from an LTAS, is the most common objective index, best used to track the same voice under consistent recording conditions

📊 See Your Own Singer's Formant with PhonaLab

PhonaLab's Spectrogram and Spectral Analysis tools let you view the 2–4 kHz region of a sustained vowel directly in your browser—no installation required—so you can see where your voice concentrates its high-frequency energy. The Pitch Visualizer adds formant tracking alongside F0.


Free with a PhonaLab account • Browser-based spectrogram and spectral views

⚠️ Educational Information

This article presents voice-science concepts and summarizes published research findings for educational purposes. It is not vocal-pedagogy instruction, clinical advice, or a method for producing any particular vocal quality. Decisions about voice training should be made with a qualified teacher, and decisions about vocal health with a qualified, licensed healthcare professional. PhonaLab provides acoustic measurement tools; it does not provide pedagogical or clinical interpretations.

References & Further Reading

Sundberg J. (1974). Articulatory interpretation of the "singing formant." Journal of the Acoustical Society of America, 55(4), 838–844.
Sundberg J. (2001). Level and center frequency of the singer's formant. Journal of Voice, 15(2), 176–186.
Omori K, Kacker A, Carroll LM, Riley WD, Blaugrund SM. (1996). Singing power ratio: Quantitative evaluation of singing voice quality. Journal of Voice, 10(3), 228–235.
Weiss R, Brown WS, Morris J. (2001). Singer's formant in sopranos: Fact or fiction? Journal of Voice, 15(4), 457–468.
Bartholomew WT. (1934). A physical definition of "good voice-quality" in the male voice. Journal of the Acoustical Society of America, 5(3_Supplement), 224.
Titze IR. (2000). Principles of Voice Production. National Center for Voice and Speech.
Sundberg J. (1987). The Science of the Singing Voice. Northern Illinois University Press.
Johnson AM, Kempster GB. (2011). Classification of the classical male singing voice using long-term average spectrum. Journal of Voice, 25(5), 538–543.
