How TOEFL Speaking is Scored: The Complete Guide to TOEFL Speaking 2026

Hey, I'm John Healy, the founder of My Speaking Score and an expert in TOEFL Speaking assessment. This page has been updated to reflect the new scoring system that evaluates the 2026 version of TOEFL Speaking.

If you are preparing for the TOEFL iBT in 2026, the first thing to understand is that TOEFL Speaking is scored differently than it used to be. ETS updated the test. The Speaking section is shorter, the tasks are different, and TOEFL Speaking scores now report on a 1 to 6 band scale aligned to the CEFR.

This guide answers the core question directly: how is TOEFL Speaking scored on the updated TOEFL iBT? It covers how the Listen and Repeat task is scored, how the Take an Interview task is scored, what the ETS automated scoring engine actually measures, and how the 1 to 6 band relates to your raw Speaking score and to the old 0 to 30 scale.

If you want to understand what is happening under the hood, where penalties come from, and what to fix first, read this end to end. If you only want the short answer, start with the next section.

Quick Answer: How Is TOEFL Speaking Scored?

TOEFL Speaking is scored by an ETS automated scoring engine, with human rater oversight for quality assurance. Your spoken responses are evaluated against ETS scoring rubrics, converted into a raw Speaking score on a 0 to 55 scale, and then reported as a TOEFL Speaking band score from 1 to 6 in 0.5 increments.

The updated TOEFL iBT Speaking section contains two task types: Listen and Repeat and Take an Interview. Each task type has its own rubric. Each response is scored from 0 to 5. The Speaking section takes about 8 minutes of base time and includes 11 items total.

The scoring rubrics focus on measurable speech features: speaking rate, pauses, hesitations, pronunciation, rhythm, prosody, vocabulary range, grammatical accuracy, and discourse organization. Your Speaking band score then maps to a CEFR level, with a 6 aligning to C2 and a 1 aligning to A1.

The ETS rubric groups these features into five scoring dimensions across the two task types: Fluency, Intelligibility, Language Use, Organization, and Repeat Accuracy. Those dimensions are where your Speaking score is actually won or lost.

That is the short version. Below is the full, structured breakdown.

The Updated TOEFL iBT Speaking Section Structure

The updated TOEFL iBT Speaking section is structured around two task types. This is a big change from the previous format, which used four independent and integrated speaking tasks. Knowing the current structure is the first step in understanding how TOEFL Speaking is scored.

| Element | Updated TOEFL iBT Speaking |
| --- | --- |
| Task types | Listen and Repeat, Take an Interview |
| Number of items | 11 |
| Base time | Approximately 8 minutes |
| Per-response raw score | 0 to 5 |
| Section raw score range | 0 to 55 |
| Reported Speaking band | 1 to 6, in 0.5 increments |
| Scoring engine | ETS automated scoring engine, with human rater oversight |

This is the framework every Speaking score now runs through. Listen and Repeat drives most of your raw score volume: 7 of the 11 items, or up to 35 of the 55 raw points. Take an Interview supplies the other 4 items, or up to 20 raw points, and each of its responses asks more of you because it is longer and graded against a broader rubric. Both are scored by the ETS automated scoring engine, with human rater oversight in the background.
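To make the arithmetic concrete, here is a minimal sketch of how 11 per-response scores add up to a raw score out of 55. It assumes a plain unweighted sum of whole-number scores; the function name and the example scores are hypothetical, and the step that turns a raw score into a 1 to 6 band is not shown because that mapping is not covered here.

```python
# Sketch only: assumes the raw Speaking score is an unweighted sum of
# 11 per-response scores, each a whole number from 0 to 5.
LISTEN_AND_REPEAT_ITEMS = 7
TAKE_AN_INTERVIEW_ITEMS = 4
MAX_ITEM_SCORE = 5


def raw_speaking_score(listen_and_repeat, take_an_interview):
    """Return the 0 to 55 raw score for one Speaking section."""
    if len(listen_and_repeat) != LISTEN_AND_REPEAT_ITEMS:
        raise ValueError("expected 7 Listen and Repeat scores")
    if len(take_an_interview) != TAKE_AN_INTERVIEW_ITEMS:
        raise ValueError("expected 4 Take an Interview scores")
    scores = list(listen_and_repeat) + list(take_an_interview)
    for score in scores:
        if not 0 <= score <= MAX_ITEM_SCORE:
            raise ValueError(f"score out of range: {score}")
    return sum(scores)


# Hypothetical test taker: stronger on short sentences, weaker on long ones.
print(raw_speaking_score([5, 4, 4, 4, 3, 3, 3], [4, 4, 3, 3]))  # 40
```

A perfect section is 11 responses at 5 each, which is where the 55 ceiling comes from.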

How the Listen and Repeat Task Is Scored

Listen and Repeat is the task a lot of test takers underestimate. It looks simple. You hear a sentence, you repeat it. The scoring model, though, is measuring specific features of your speech production, not just whether you understood what was said.

In the Listen and Repeat task, you move through a scenario set in an academic or campus life context. You hear 7 sentences of increasing complexity. After each sentence, you have 8 to 12 seconds to repeat it exactly. Each response is scored from 0 to 5 using the Listen and Repeat rubric.

Here is the Listen and Repeat scoring rubric, simplified for clarity.

| Score | What a typical response looks like |
| --- | --- |
| 5 | Fully intelligible. Exact repetition of the prompt. |
| 4 | Captures the meaning. Minor word or grammar changes that do not change meaning. May have one or two ambiguous words due to pronunciation. Self-correction is allowed. |
| 3 | Essentially a full sentence, but does not accurately capture the original meaning. Function words or content words may be missing or changed. Occasional intelligibility issues. |
| 2 | Missing a significant part of the prompt or highly inaccurate. Not a self-standing sentence. Low intelligibility for a listener unfamiliar with the prompt. |
| 1 | Minimal response. Most of the prompt is missing. Mostly unintelligible. |
| 0 | No response, entirely unintelligible, not in English, or completely unconnected to the prompt. |

The practical implication: you do not have to be perfect to score a 4. Minor function word shifts and small grammar tweaks are acceptable at the 4 level. What kills the score is missing content, unintelligibility, or producing a fragment instead of a full sentence.

The automated scoring engine evaluates Listen and Repeat on three ETS scoring dimensions: Fluency, Intelligibility, and Repeat Accuracy. Those dimensions map to specific speech features.

| Listen and Repeat scoring dimension | Feature examples |
| --- | --- |
| Fluency | Speaking rate, length of uninterrupted runs, number of pauses, number of hesitations |
| Intelligibility | Correctness of pronunciation, naturalness of speech rhythm, naturalness of prosody (syllable stress) |
| Repeat Accuracy | Correctly repeated words compared to the prompt |

One of the clearest patterns I see in Listen and Repeat: test takers lose more points to hesitation and to running words together than to wrong words. If you get flustered and stall mid-sentence, intelligibility drops and repeat accuracy drops with it. The scoring engine is measuring whether a listener could reliably recover the original sentence from your audio.
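If you want a rough way to check Repeat Accuracy on your own transcripts, the sketch below counts how many prompt words show up in your response in the right order. This is an assumption about what "correctly repeated words compared to the prompt" could look like in code, not the ETS implementation. The function names and the sample sentence are hypothetical, and the alignment uses Python's standard `difflib` module.

```python
import re
from difflib import SequenceMatcher


def tokenize(text):
    """Lowercase the text and split it into words, ignoring punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def repeat_accuracy(prompt, response):
    """Share of prompt words reproduced in order (0.0 to 1.0).

    Illustrative only: the real engine works from audio, not from a
    clean transcript, and its exact metric is not public here.
    """
    prompt_words = tokenize(prompt)
    response_words = tokenize(response)
    if not prompt_words:
        return 0.0
    matcher = SequenceMatcher(a=prompt_words, b=response_words, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / len(prompt_words)


prompt = "The library closes at nine on weekdays"
print(repeat_accuracy(prompt, "The library closes at nine on weekdays."))
print(repeat_accuracy(prompt, "The library closes at nine"))
```

The second call is the "dropped the end of the sentence" pattern: five of seven words survive, so the ratio falls well below 1.0 even though every word spoken was correct.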

How the Take an Interview Task Is Scored

Take an Interview is where your TOEFL Speaking band score is really decided. It asks more of you, and it is graded against a broader rubric.

In the Take an Interview task, you enter a simulated conversation with a prerecorded interviewer. The scenario is academic or campus related: applying for a scholarship, joining a research study, and so on. You answer 4 questions. You have 45 seconds per question. Early questions focus on factual information or personal experience. Later questions ask you to express and support an opinion.

Each response is scored from 0 to 5 using the Take an Interview rubric.

| Score | What a typical response looks like |
| --- | --- |
| 5 | Fully successful. On topic and well elaborated. Good conversational pace with natural pauses. Pronunciation easily intelligible. Range of accurate grammar and vocabulary. |
| 4 | Generally successful. On topic and elaborated, but may lack effective sentence-level connectors. Good pace with some pausing. Occasional words may require minor effort to understand. Grammar and vocabulary adequate for general meanings. |
| 3 | Partially successful. Addresses the question with limited elaboration or clarity. Frequent or lengthy pauses produce a choppy pace. Filler words are frequent. Intelligibility sometimes affected by word-level pronunciation or stress issues. Limited range of grammar and vocabulary. |
| 2 | Mostly unsuccessful. Minimally connected to the question. Little or no relevant elaboration. Often recycles language from the question itself. Intelligibility limited. Very limited range of grammar and vocabulary. |
| 1 | Unsuccessful. Only vaguely connected to the question. Mostly unintelligible. Isolated words or phrases. |
| 0 | No response, entirely unintelligible, not in English, or unconnected to the prompt. |

The Take an Interview task is scored on four ETS rubric dimensions: Fluency, Intelligibility, Language Use (Vocabulary and Grammar), and Organization. This is where more complex features of English come into play.

| Take an Interview scoring dimension | Feature examples |
| --- | --- |
| Fluency | Speaking rate, length of uninterrupted runs, number of pauses, number of hesitations |
| Intelligibility | Correctness of pronunciation, naturalness of speech rhythm, naturalness of prosody (syllable stress) |
| Language Use: Vocabulary and Grammar | Vocabulary diversity (wide range of distinct words), vocabulary richness (less common words), grammaticality, grammatical accuracy (few grammar errors) |
| Organization | Discourse coherence, use of discourse connectives |

Here is the nuance. On Listen and Repeat, the question is whether you can accurately reproduce input. On Take an Interview, the question is whether you can produce clear, organized, grammatically controlled language under time pressure. If you drift off topic, recycle the interviewer's wording, or stall with filler words, you cap your score at the 2 or 3 level regardless of your pronunciation.

TOEFL Speaking Scoring: Automated Engine With Human Oversight

ETS scores TOEFL Speaking using an automated scoring engine as the primary scoring method, with human rater oversight for quality assurance.

The automated scoring engine was trained on a large corpus of human-scored TOEFL Speaking responses. For each spoken response, it extracts speech features across the rubric dimensions and outputs a score.

The automated system is not a black box loosely mimicking humans. According to the TOEFL iBT Technical Manual, the Speaking section shows a Human-Machine correlation of 0.89 and a Human-Human correlation of 0.96. In other words, the engine's agreement with human ratings approaches the level at which two well-trained human raters agree with each other on single responses.
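If the word "correlation" feels abstract, this sketch shows one common way to compute it. It assumes a Pearson correlation, which may or may not be the statistic behind the figures above, and the two score lists are made-up toy values, not ETS data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return covariance / sqrt(var_x * var_y)


# Toy 0 to 5 ratings for five responses (invented for illustration).
human_scores = [5, 4, 3, 2, 4]
machine_scores = [5, 4, 3, 3, 4]
print(round(pearson(human_scores, machine_scores), 2))
```

A value near 1.0 means the two sets of scores rise and fall together; a value near 0 means they barely track each other.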

This matters because it means the features the engine measures are the features that move your score. You do not get to argue with the rubric. If your speaking rate is slow, your pauses are frequent, your prosody is unnatural, or your grammar shows frequent errors, the engine will detect it and the score will reflect it.

TOEFL Speaking Band Score: The 1 to 6 Scale

The updated TOEFL iBT reports Speaking on a 1 to 6 band scale in 0.5 increments. This is a break from the old 0 to 30 Speaking scale. ETS made the change to align TOEFL scores more intuitively with the CEFR.

| TOEFL Speaking band | CEFR level | General interpretation |
| --- | --- | --- |
| 6 | C2 | Proficient user. Full command of the language. |
| 5 to 5.5 | C1 | Proficient user. Effective operational command. |
| 4 to 4.5 | B2 | Independent user. Upper intermediate. |
| 3 to 3.5 | B1 | Independent user. Lower intermediate. |
| 2 to 2.5 | A2 | Basic user. Elementary. |
| 1 to 1.5 | A1 | Basic user. Beginner. |

If a university requires a CEFR level of B2, you are targeting a TOEFL Speaking band of 4 or higher. If it requires C1, you are targeting 5 or higher. That is the leverage point for setting a realistic Speaking score goal.
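The band-to-CEFR table is easy to turn into a quick check against a program's requirement. This is a small sketch using only the mapping in the table; the function names are hypothetical.

```python
CEFR_ORDER = ["A1", "A2", "B1", "B2", "C1", "C2"]


def cefr_for_band(band):
    """Map a Speaking band (1 to 6, in 0.5 steps) to its CEFR level."""
    if not 1 <= band <= 6 or band * 2 != int(band * 2):
        raise ValueError(f"not a valid Speaking band: {band}")
    if band == 6:
        return "C2"
    if band >= 5:
        return "C1"
    if band >= 4:
        return "B2"
    if band >= 3:
        return "B1"
    if band >= 2:
        return "A2"
    return "A1"


def meets_requirement(band, required_cefr):
    """True if the band's CEFR level is at or above the required level."""
    achieved = CEFR_ORDER.index(cefr_for_band(band))
    return achieved >= CEFR_ORDER.index(required_cefr)


print(cefr_for_band(4.5))             # B2
print(meets_requirement(4.5, "C1"))   # False
```

Note that a 4.5 is still B2: the half band does not move you into C1 until you reach 5.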

How the New Band Score Maps to the Old 0 to 30 TOEFL Speaking Score

For two years after the update, TOEFL score reports reference both the new 1 to 6 band and the old 0 to 30 scale. If you have a target Speaking score from a university that was written against the old scale, this is the conversion you need.

| New TOEFL Speaking band | Old Speaking score (0 to 30) | Old overall score (0 to 120) |
| --- | --- | --- |
| 6 | 28 to 30 | 114 |
| 5.5 | 27 | 107+ |
| 5 | 25 to 26 | 95+ |
| 4.5 | 23 to 24 | 86+ |
| 4 | 20 to 22 | 72+ |
| 3.5 | 18 to 19 | 58+ |
| 3 | 16 to 17 | 44+ |
| 2.5 | 13 to 15 | 34+ |
| 2 | 10 to 12 | 24+ |
| 1.5 | 5 to 9 | 12+ |
| 1 | 0 to 4 | 0+ |

Use this table to translate old-scale target scores. A university asking for a Speaking 26 on the old scale is effectively asking for a band 5 on the updated TOEFL iBT.
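If you have a list of programs with old-scale requirements, a lookup like the one below saves you from rereading the table each time. It is a sketch built from the Speaking column of the conversion table only; the names are hypothetical.

```python
# (lowest old score, highest old score, new band), copied from the
# Speaking column of the conversion table.
OLD_SPEAKING_TO_BAND = [
    (28, 30, 6.0),
    (27, 27, 5.5),
    (25, 26, 5.0),
    (23, 24, 4.5),
    (20, 22, 4.0),
    (18, 19, 3.5),
    (16, 17, 3.0),
    (13, 15, 2.5),
    (10, 12, 2.0),
    (5, 9, 1.5),
    (0, 4, 1.0),
]


def band_for_old_speaking_score(old_score):
    """Translate an old 0 to 30 Speaking score into the new 1 to 6 band."""
    for low, high, band in OLD_SPEAKING_TO_BAND:
        if low <= old_score <= high:
            return band
    raise ValueError(f"old Speaking score must be 0 to 30, got {old_score}")


print(band_for_old_speaking_score(26))  # 5.0
print(band_for_old_speaking_score(22))  # 4.0
```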

What Actually Moves Your TOEFL Speaking Score

This is the section most test takers need the most. Understanding how TOEFL Speaking is scored is only useful if you translate it into concrete behaviors the scoring engine will reward. Here is what the ETS feature set tells you.

Speaking Rate

The scoring engine measures speaking rate directly. Too slow, and Fluency drops. Too fast, and Intelligibility drops because pronunciation degrades. The goal is a conversational pace with natural pauses. Score 5 responses in the rubric are explicitly described as maintaining "good conversational speaking pace."

Length of Uninterrupted Runs

A "run" is a stretch of speech without a pause. Longer runs suggest you have internalized the language. Frequent short runs broken up by hesitations look like a test taker who is retrieving language word by word. Work on producing runs of 8 to 12 words without pausing.
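Speaking rate, pauses, and runs can all be estimated from word timestamps, which many speech-to-text tools can export. The sketch below is one way to do that under stated assumptions: the `TimedWord` type, the function name, and especially the 0.5-second pause threshold are mine, not values published by ETS.

```python
from dataclasses import dataclass


@dataclass
class TimedWord:
    text: str
    start: float  # seconds from the start of the recording
    end: float


# Assumption: a silent gap of at least 0.5 s counts as a pause.
PAUSE_THRESHOLD_S = 0.5


def fluency_features(words, pause_threshold=PAUSE_THRESHOLD_S):
    """Estimate speaking rate, pause count, and mean run length."""
    if not words:
        return {"words_per_minute": 0.0, "pause_count": 0, "mean_run_length": 0.0}
    runs = [1]  # number of words in each uninterrupted run
    for previous, current in zip(words, words[1:]):
        if current.start - previous.end >= pause_threshold:
            runs.append(1)   # a pause ends the run and starts a new one
        else:
            runs[-1] += 1
    duration = words[-1].end - words[0].start
    rate = len(words) / duration * 60 if duration > 0 else 0.0
    return {
        "words_per_minute": rate,
        "pause_count": len(runs) - 1,
        "mean_run_length": len(words) / len(runs),
    }


sample = [
    TimedWord("I", 0.0, 0.3), TimedWord("think", 0.35, 0.7),
    TimedWord("that", 0.75, 1.0), TimedWord("studying", 1.9, 2.3),
    TimedWord("abroad", 2.35, 2.65), TimedWord("helps", 2.7, 3.0),
]
print(fluency_features(sample))
```

In the sample, the 0.9-second stall after "that" splits six words into two runs of three, which is the word-by-word retrieval pattern described above.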

Number of Pauses and Hesitations

This is the big one for most test takers. The scoring engine counts pauses and hesitations. Frequent filler words ("um," "uh," "like") and mid-sentence stalls all register as hesitations. On the rubric, this is the difference between a 4 and a 3 on Take an Interview. At 3, "filler words are frequent." At 4, "pausing may minimally affect flow."
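You can get a rough read on your own filler habit from a transcript. The sketch below counts fillers per 100 words. The filler list and function name are assumptions, and "like" is deliberately included even though it will also catch legitimate uses ("I like biology"), so treat the number as an upper bound.

```python
import re

# Assumption: a small, hand-picked filler list. "like" overcounts because
# it is also a normal verb and preposition.
FILLERS = ("um", "uh", "er", "like")


def filler_rate(transcript, fillers=FILLERS):
    """Return (filler count, fillers per 100 words) for a transcript."""
    words = re.findall(r"[a-z']+", transcript.lower())
    if not words:
        return 0, 0.0
    count = sum(1 for word in words if word in fillers)
    return count, 100 * count / len(words)


print(filler_rate("Um, I think, uh, the library is, um, really useful"))
```

Track the rate across several practice responses rather than judging a single one; the trend matters more than any one number.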

Pronunciation, Rhythm, and Prosody

Intelligibility measures whether your pronunciation, word stress, and intonation carry meaning clearly. You do not need to sound like a native speaker. You do need to be understood without effort. The scoring engine evaluates correctness of pronunciation, naturalness of speech rhythm, and naturalness of prosody (syllable stress). Flat intonation, misplaced syllable stress, and run-together words all drag Intelligibility down.

Vocabulary Diversity and Richness

In Take an Interview, Language Use is scored on vocabulary diversity (range of distinct words) and vocabulary richness (how often you use lower-frequency words). If every response uses the same 80 to 100 words, the engine detects it. Build a bank of precise vocabulary you can deploy under pressure.
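One simple proxy for vocabulary diversity is the type-token ratio: distinct words divided by total words. This is a sketch under that assumption, not the measure ETS uses, and the function name and sample sentences are hypothetical.

```python
import re


def vocabulary_diversity(transcript):
    """Type-token ratio: distinct words / total words (0.0 to 1.0)."""
    words = re.findall(r"[a-z']+", transcript.lower())
    if not words:
        return 0.0
    return len(set(words)) / len(words)


repetitive = "the course was good and the teacher was good"
varied = "the seminar was rigorous and my instructor gave detailed feedback"
print(round(vocabulary_diversity(repetitive), 2))  # 0.67
print(round(vocabulary_diversity(varied), 2))      # 1.0
```

The ratio naturally falls as responses get longer, so only compare responses of similar length, such as two 45-second answers.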

Grammaticality and Grammatical Accuracy

The engine measures whether your sentences are grammatically well-formed (grammaticality) and how often errors appear (grammatical accuracy). A 5-level response has "a range of accurate grammar and vocabulary." A 3-level response has "limited range and accuracy of grammar and vocabulary" that "noticeably restrict the precision and clarity of meanings."

Discourse Coherence and Connectives

Take an Interview scores Organization. The engine looks for discourse connectives ("first," "because," "for example," "however," "on the other hand") and for coherence across sentences. Unlinked ideas produce lower scores even if the vocabulary and grammar are fine.
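To see which connectives you actually rely on, you can count them in a transcript. The sketch below uses the five examples named above as its list; a real connective inventory would be much longer, and simple counting says nothing about whether the connective is used correctly. The function name is hypothetical.

```python
import re

# The five connectives named in the text; extend this list for real use.
CONNECTIVES = ("first", "because", "for example", "however", "on the other hand")


def connectives_used(transcript, connectives=CONNECTIVES):
    """Return {connective: count} for every connective that appears."""
    # Normalize to lowercase words separated by single spaces.
    text = " ".join(re.findall(r"[a-z']+", transcript.lower()))
    found = {}
    for connective in connectives:
        pattern = r"\b" + re.escape(connective) + r"\b"
        hits = len(re.findall(pattern, text))
        if hits:
            found[connective] = hits
    return found


answer = (
    "First, I prefer group study because it keeps me focused. "
    "For example, my lab partner explains things clearly. "
    "However, I still review alone."
)
print(connectives_used(answer))
```

If the same one or two keys dominate every response, that is the repetition pattern to work on.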

Where TOEFL Speaking Test Takers Lose Points

Across thousands of responses scored on My Speaking Score, the same penalty patterns appear.

On Listen and Repeat, test takers lose points by dropping the final content word of a long sentence, substituting a function word that changes meaning, running words together and creating intelligibility dips, inserting hesitations that delay the response past the 8 to 12 second limit, and repeating only the first half of the prompt.

On Take an Interview, test takers lose points by recycling the interviewer's question language instead of producing original language, stalling with filler words during the first 5 to 10 seconds of each response, delivering short responses that do not elaborate, using the same 3 or 4 connectives repeatedly, and drifting off topic in the second half of the 45-second window.

This is where a data-powered prep platform is useful. You want to see the specific features causing your score drop, not guess.

How My Speaking Score Uses This Framework

My Speaking Score is a data-powered TOEFL Speaking prep platform. It scores your practice responses using an automated engine grounded in ETS assessment science and reports back on the same rubric dimensions ETS uses: Fluency, Intelligibility, Language Use, Organization, and Repeat Accuracy.

When you practice a Take an Interview task on My Speaking Score, you get an estimated Speaking band score; a per-dimension breakdown of where you are strong and where you are losing points; the specific measurable features (speaking rate, pauses, hesitations, vocabulary diversity, grammatical accuracy, coherence) that are driving your score; and a comparison against rubric-level benchmarks so you can see what a band 4 response actually looks like versus your own.

If you are guessing at what to fix, you are optimizing for the wrong thing. Scoring engines reward specific behaviors. Knowing which ones matter most is the difference between a 3.5 and a 4.5.

FAQ: How Is TOEFL Speaking Scored

How is TOEFL Speaking scored on the updated TOEFL iBT?

TOEFL Speaking is scored by an ETS automated scoring engine, with human rater oversight for quality assurance. You complete 11 items across two task types (Listen and Repeat and Take an Interview). Each response is scored 0 to 5 against ETS rubrics. Raw scores sum to a 0 to 55 range and are reported as a Speaking band from 1 to 6 in 0.5 increments.

What are the new TOEFL Speaking task types?

The updated TOEFL iBT Speaking section has two task types: Listen and Repeat and Take an Interview. Listen and Repeat asks you to repeat 7 sentences of increasing complexity in a scenario. Take an Interview asks you to answer 4 questions in a simulated conversation, with 45 seconds per response.

What is the TOEFL Speaking scoring rubric?

ETS publishes two scoring rubrics, one per task type. Each rubric uses a 0 to 5 scale. Listen and Repeat is scored on fluency, intelligibility, and repeat accuracy. Take an Interview is scored on fluency, intelligibility, language use (vocabulary and grammar), and organization.

How does the ETS automated scoring engine score TOEFL Speaking?

ETS uses an automated scoring engine to score TOEFL Speaking responses, with human rater oversight for quality assurance. The engine extracts speech features including speaking rate, pauses, hesitations, pronunciation accuracy, rhythm, prosody, vocabulary diversity, grammaticality, and discourse coherence. These features are rolled up into rubric-level dimension scores: Fluency, Intelligibility, Language Use, Organization, and Repeat Accuracy.

What is a good TOEFL Speaking score?

A good TOEFL Speaking score depends on your target. A band score of 4 aligns with CEFR B2 and is roughly equivalent to 20 to 22 on the old 0 to 30 scale. A band score of 5 aligns with CEFR C1 and is roughly equivalent to 25 to 26 on the old scale. Band 6 aligns with C2 and maps to 28 to 30 on the old scale.

How does the new TOEFL Speaking band score convert to the old 0 to 30 scale?

For two years after the test update, TOEFL score reports include both scales. Band 6 corresponds to 28 to 30. Band 5 corresponds to 25 to 26. Band 4 corresponds to 20 to 22. Band 3 corresponds to 16 to 17. Band 2 corresponds to 10 to 12. Band 1 corresponds to 0 to 4.

How long is the updated TOEFL iBT Speaking section?

The Speaking section has a base time of approximately 8 minutes across 11 items. This is significantly shorter than the previous Speaking section.

What is the fastest way to improve a TOEFL Speaking score?

The fastest gains usually come from reducing filler words and hesitations, maintaining a conversational speaking rate, and producing longer uninterrupted runs of speech. These three features drive the Fluency dimension that carries heavy weight on both Listen and Repeat and Take an Interview. Getting a data-powered read on your own performance lets you target the right feature instead of practicing blindly.

Take a Practice Test on My Speaking Score

If you want to see where you stand on the updated TOEFL iBT Speaking section, and where the specific penalties are showing up in your own responses, take a practice test on My Speaking Score. You will get an estimated Speaking band score, a breakdown across the ETS rubric dimensions (Fluency, Intelligibility, Language Use, Organization, and Repeat Accuracy), and concrete feedback on speaking rate, pauses, pronunciation, vocabulary, grammar, and organization. That is a clearer path forward than guessing which part of your speaking to fix next.

Final Takeaway

How TOEFL Speaking is scored in 2026 is actually simpler to understand than it used to be. Two task types. One banded 1 to 6 score. A small number of well-defined scoring dimensions tied to specific measurable features of your speech. If you understand the rubric, train the features, and measure your progress, the score will move. If you treat Speaking as a black box, you will keep hitting the same ceiling.

Know the system. Train the features. Measure the data. That is how you build a strong TOEFL Speaking band score.