Introduction
The new TOEFL Speaking test no longer begins with an opinion or paired-choice question. In fact, all four "classic" tasks that have defined the TOEFL Speaking genre since 2019 are gone.
For the first time, ETS has separated how you sound from what you say.
Clarity now drives fluency. Rhythm now defines delivery.
And together, they anchor every score you’ll earn on the new test.
To see the complete scoring logic and construct breakdown, check out the latest draft of my new untitled TOEFL Speaking 2026 book.
The New Structure of TOEFL Speaking 2026
The new test has just two tasks, both scored by AI:
- Listen & Repeat – tests pronunciation, timing, and intelligibility.
- Take an Interview – tests spontaneous reasoning and coherence.
Each task generates measurable data for the same four constructs:
Delivery, Intelligibility, Language Use, and Topic Development.
Task 1: Listen & Repeat
This task measures sound control under time pressure—how well you can reproduce spoken English with accurate pronunciation and stable rhythm.
You’ll hear seven short prompts and repeat each one aloud.
The system records your speech immediately.
The time distribution (8, 8, 10, 10, 10, 12, and 12 seconds) is deliberate.
Each step increases linguistic load—more words, more rhythm shifts, more opportunities to lose control.
By the seventh sentence, the AI has collected a complete acoustic fingerprint of your speaking ability: your pacing stability, articulation precision, and how predictable your rhythm is to a listener.
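To make the timing concrete, here is a minimal arithmetic sketch. It treats the seven figures as seconds (as the FAQ states) and assumes a steady rate of 150 words per minute, the midpoint of the 140–160 WPM band quoted later in this post. The function name and the 150 WPM default are my own illustration, not anything published by ETS, and the result is only an upper bound on how many words fit in each window, not the actual prompt length.

```python
# Window lengths in seconds, as listed in this post.
WINDOWS_S = [8, 8, 10, 10, 10, 12, 12]


def words_that_fit(seconds: float, wpm: float = 150.0) -> float:
    """Upper bound on words spoken in `seconds` at a steady rate of `wpm`.

    The 150 WPM default is an assumption chosen for illustration.
    """
    return wpm * seconds / 60.0


total_s = sum(WINDOWS_S)
capacity = [words_that_fit(s) for s in WINDOWS_S]

print(total_s)   # 70
print(capacity)  # [20.0, 20.0, 25.0, 25.0, 25.0, 30.0, 30.0]
```

Under those assumptions the whole task adds up to 70 seconds, and the longest windows leave room for about half as many words again as the shortest ones.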
What the AI Actually Measures
The scoring engine doesn’t “listen” the way humans do.
It converts your waveform into features and compares them to reference data from high-performing speakers.
These features combine into your Intelligibility score (0–5) and partially inform Delivery.
A stable, predictable rhythm with ≥90% pronunciation accuracy is the hallmark of a high-band performance.
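The exact feature set inside the scoring engine isn't spelled out here, so treat the following as a rough sketch of what "converting speech into features" can look like, not as the engine's actual method. It assumes you already have word-level timestamps (for example from a forced aligner), and the `Word` tuple, both function names, and the 0.15-second minimum gap are all hypothetical choices of mine.

```python
from typing import List, Tuple

# (text, start_s, end_s): hypothetical word-level timestamps from an aligner.
Word = Tuple[str, float, float]


def speaking_rate_wpm(words: List[Word]) -> float:
    """Words per minute from the first word's onset to the last word's offset."""
    if not words:
        return 0.0
    duration_s = words[-1][2] - words[0][1]
    return len(words) * 60.0 / duration_s if duration_s > 0 else 0.0


def mean_pause_s(words: List[Word], min_gap_s: float = 0.15) -> float:
    """Average silent gap between consecutive words.

    Gaps shorter than `min_gap_s` are treated as normal word transitions
    and ignored (the threshold is an assumption for illustration).
    """
    gaps = [nxt[1] - cur[2] for cur, nxt in zip(words, words[1:])]
    pauses = [g for g in gaps if g >= min_gap_s]
    return sum(pauses) / len(pauses) if pauses else 0.0


sample = [
    ("the", 0.0, 0.25),
    ("lecture", 0.25, 0.75),
    ("starts", 1.25, 1.5),
    ("soon", 1.5, 2.0),
]

print(speaking_rate_wpm(sample))  # 120.0
print(mean_pause_s(sample))       # 0.5
```

In the sample, four words over two seconds gives 120 WPM, and the single half-second gap before "starts" is the only pause that counts.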
Why Listen & Repeat Comes First
ETS placed this task first for a psychometric reason: it builds a speech baseline before testing reasoning.
It tells the scoring model how you sound when you’re not planning content—only controlling sound.
That baseline then calibrates your Interview Task score.
If your rhythm and pronunciation are strong here, your Interview Task is scored against a clean reference for your voice.
This ensures fairness across accents and L1 backgrounds.
In short: the first task measures how you speak; the second measures what you say.
Task 2: Take an Interview
The Interview Task measures spontaneous academic-interpersonal speaking—your ability to produce extended, reasoned, and coherent speech.
But ETS designed it using the same constructs, so intelligibility still matters deeply.
Each Interview Task follows a controlled four-turn progression.
The pattern keeps every response predictable, measurable, and fair across cultures.
This “function ladder” allows the AI to measure how intelligibility and fluency hold up as complexity increases.
Even when you’re expressing ideas, the model still tracks rhythm, timing, and clarity—because they predict listener comprehension better than vocabulary range alone.
How Intelligibility Connects Both Tasks
Think of Listen & Repeat as your calibration test and Interview as your application test.
The first measures control; the second measures endurance.
If your rhythm breaks down under pressure, your Intelligibility and Delivery scores drop across both tasks.
If you maintain clarity and timing throughout, your composite score climbs sharply.
My research shows that speakers who stay within these bands tend to score in the top quartile:
- Speaking Rate: 140–160 WPM
- Pronunciation Accuracy: ≥90%
- Average Pause Length: ≤0.7 seconds
- Rhythmic Deviation: ≤5% across sentences
These benchmarks define the measurable limits of clear, understandable speech.
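If you track these four numbers yourself, checking them against the bands above is a few lines of code. This is a minimal sketch: the `SpeechMetrics` class, its field names, and `check_bands` are hypothetical, and the metric values are assumed to come from whatever tool you use to measure your recordings. Accuracy and rhythmic deviation are expressed as fractions (0.90 means 90%).

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class SpeechMetrics:
    """One recording's metrics (field names are illustrative, not an official schema)."""
    wpm: float                     # speaking rate, words per minute
    pronunciation_accuracy: float  # fraction between 0 and 1
    mean_pause_s: float            # average pause length in seconds
    rhythmic_deviation: float      # fraction between 0 and 1


def check_bands(m: SpeechMetrics) -> Dict[str, bool]:
    """Return True for each metric that falls inside the band quoted in this post."""
    return {
        "speaking_rate": 140 <= m.wpm <= 160,
        "pronunciation_accuracy": m.pronunciation_accuracy >= 0.90,
        "mean_pause": m.mean_pause_s <= 0.7,
        "rhythmic_deviation": m.rhythmic_deviation <= 0.05,
    }


result = check_bands(SpeechMetrics(152, 0.93, 0.9, 0.04))
misses = [name for name, ok in result.items() if not ok]
print(misses)  # ['mean_pause']
```

Here the speaker is inside every band except pausing, which tells you exactly where to spend your next practice session.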
How to Practice Intelligibility
You can train intelligibility the same way you train accuracy in writing: by making your data visible.
Every time you record a TOEFL Speaking response on My Speaking Score, you get SpeechRater-style metrics on clarity, timing, and stress balance.
Use them to build a ten-minute daily routine.
When you measure daily, you start to see what clarity sounds like.
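One simple way to make a daily log readable is a rolling average, which smooths out day-to-day noise so the trend shows. The sketch below assumes you copy one number per day (here, average pause length in seconds) into a list by hand. The `rolling_mean` helper and the sample values are made up for illustration.

```python
from statistics import mean
from typing import List, Sequence


def rolling_mean(values: Sequence[float], window: int = 7) -> List[float]:
    """Mean of the most recent `window` values at each day.

    Early days use however many values exist so far.
    """
    return [
        mean(values[max(0, i - window + 1): i + 1])
        for i in range(len(values))
    ]


# Hypothetical week of average pause lengths, in seconds.
daily_pause_s = [1.0, 0.9, 0.9, 0.8, 0.8, 0.7, 0.7]
trend = rolling_mean(daily_pause_s, window=3)

for day, value in enumerate(trend, start=1):
    print(f"day {day}: {value:.2f} s")
```

If the smoothed number keeps falling toward the 0.7-second mark, the routine is working. If it flattens out, change what you practice rather than practicing more of the same.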
Common Intelligibility Breakdowns
This is going to sound like AI, but the goal really isn’t perfection—it’s predictability.
Predictable speech is easier to understand, easier to score, and statistically more stable across test conditions.
In Summary
TOEFL Speaking 2026 measures clarity first, ideas second.
The Listen & Repeat task isolates your control of English sounds.
The Interview task tests whether that control holds when you start thinking out loud.
Together, they redefine what a “high score” means:
- It’s not about accent.
- It’s not about memorization.
- It’s about intelligibility—the science of being understood.
Or as Perfect TOEFL Speaking 2026 puts it:
“Fluency without clarity is noise.
Clarity without rhythm is hesitation.
Intelligibility is both, balanced and measurable.”
FAQ
Q: What is intelligibility in TOEFL Speaking 2026?
A: It’s how easily your speech can be understood. The score reflects pronunciation accuracy, timing stability, and rhythm consistency.
Q: How many Listen & Repeat prompts are there?
A: Seven. In order, they last 8, 8, 10, 10, 10, 12, and 12 seconds. The sentences get progressively longer to test stability.
Q: Does accent affect intelligibility?
A: No. The AI is trained to score based on clarity, not accent type.
Q: What does the Interview Task measure?
A: Your ability to organize ideas and maintain intelligibility while reasoning aloud.
Q: How can I practice effectively?
A: Record both tasks, analyze SpeechRater-style feedback, and track pronunciation accuracy, speaking rate, and pausing.
Call to Action
You can’t guess your intelligibility—but you CAN measure it.
Try My Speaking Score for real TOEFL Speaking practice and instant AI feedback.
Every task shows your intelligibility score, speaking rate, and clarity metrics—so you can build the rhythm, timing, and control that define TOEFL Speaking 2026.