Word Error Rate (WER) Explained — and How to Measure It

What is Word Error Rate?

Word Error Rate (WER) is the standard way to measure how far one text differs from another at the word level, most often a transcript or subtitle file compared against a trusted reference. It answers a simple question: relative to the length of the reference, how many word edits does it take to turn one text into the other? It is the go-to metric for speech recognition and transcription, and it is also used for subtitle QA and, less commonly, translation.

How WER is calculated

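WER counts three kinds of word-level errors between a hypothesis (the text being scored) and a reference (the text treated as correct). The two texts are aligned so that the total number of edits is as small as possible, and each edit falls into one of these categories: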

Substitutions (S) — a word was replaced with a different word.
Deletions (D) — a word in the reference is missing.
Insertions (I) — an extra word appears that isn’t in the reference.

The formula is:

WER = (S + D + I) / N

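where N is the number of words in the reference. A WER of 0 means a perfect match; a WER of 0.10 means roughly one error per ten reference words. It is often shown as a percentage (10%). Because insertions count as errors but do not add to N, WER can exceed 1 (100%) when the hypothesis contains many extra words.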
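
To make the formula concrete, here is a minimal Python sketch that aligns two texts with a word-level edit distance and reports S, D, I, and WER. The names WerResult and word_error_rate are illustrative and not taken from any particular tool. The sketch assumes words are separated by whitespace and applies no normalization, so case and punctuation differences count as errors.

```python
from dataclasses import dataclass


@dataclass
class WerResult:
    substitutions: int
    deletions: int
    insertions: int
    reference_words: int

    @property
    def wer(self) -> float:
        errors = self.substitutions + self.deletions + self.insertions
        return errors / self.reference_words


def word_error_rate(reference: str, hypothesis: str) -> WerResult:
    """Count S, D, I for the minimum-edit alignment of two texts."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # Each cell holds (total_edits, S, D, I) for the best alignment of
    # ref[:i] with hyp[:j]. Tuples compare by total_edits first.
    prev = [(j, 0, 0, j) for j in range(len(hyp) + 1)]
    for i in range(1, len(ref) + 1):
        curr = [(i, 0, i, 0)]
        for j in range(1, len(hyp) + 1):
            if ref[i - 1] == hyp[j - 1]:
                best = prev[j - 1]  # words match, no new error
            else:
                total, s, d, ins = prev[j - 1]
                substitution = (total + 1, s + 1, d, ins)
                total, s, d, ins = prev[j]
                deletion = (total + 1, s, d + 1, ins)
                total, s, d, ins = curr[j - 1]
                insertion = (total + 1, s, d, ins + 1)
                best = min(substitution, deletion, insertion)
            curr.append(best)
        prev = curr

    _, s, d, ins = prev[-1]
    return WerResult(s, d, ins, len(ref))


result = word_error_rate(
    "the quick brown fox jumps over the lazy dog",
    "the quick brown fox jumped over lazy dog today",
)
print(result, round(result.wer, 3))
```

In this example the hypothesis has one substitution (jumps became jumped), one deletion (the second "the"), and one insertion (today), so the WER is 3 / 9, or about 0.333. When several alignments tie on total edits, the split between S, D, and I can vary, but the WER itself does not.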

What counts as a good WER?

It depends on the use case, but as a rough guide:

Under 5% — excellent; near human-level for clean audio.
5–10% — good; usable transcripts with light cleanup.
10–20% — noticeable errors; needs review.
Above 20% — significant errors; substantial editing required.

Difficult audio (accents, noise, overlapping speakers) naturally pushes WER up.
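
If you want to apply the rough guide above in a script, a small helper like the following could label a score. The function name wer_band and the handling of the shared boundaries are assumptions made for this sketch: exactly 10% is treated as "good" and exactly 20% as "noticeable errors".

```python
def wer_band(wer: float) -> str:
    """Map a WER fraction (0.10 means 10%) to the rough quality bands."""
    if wer < 0:
        raise ValueError("WER cannot be negative")
    if wer < 0.05:
        return "excellent"
    if wer <= 0.10:  # boundary choice is an assumption of this sketch
        return "good"
    if wer <= 0.20:  # boundary choice is an assumption of this sketch
        return "noticeable errors"
    return "significant errors"


for value in (0.03, 0.08, 0.15, 0.42):
    print(f"{value:.0%} -> {wer_band(value)}")
```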

How to measure WER between two files

DiffALL computes WER automatically when you compare two subtitle or text files:

Upload or paste both files — for example a reference transcript and an auto-generated one, or two subtitle files (SRT/VTT).
DiffALL reports the overall WER, plus a line-by-line diff showing every substitution, insertion, and deletion.
For subtitles, it also measures timing drift — the millisecond offset between corresponding entries.
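
For readers curious what a timing-drift measurement involves, the sketch below shows one simple way to compute it in Python. It is not DiffALL's implementation. It assumes timestamps in the HH:MM:SS,mmm (SRT) or HH:MM:SS.mmm (VTT) form and pairs entries by position, which only works when both files contain the same cues in the same order. The function names are illustrative.

```python
import re

# Matches a cue timing line such as "00:00:01,000 --> 00:00:03,000".
TIMING = re.compile(
    r"(\d{2}):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*"
    r"(\d{2}):(\d{2}):(\d{2})[,.](\d{3})"
)


def start_times_ms(subtitle_text: str) -> list[int]:
    """Return the start time of every cue, in milliseconds."""
    starts = []
    for match in TIMING.finditer(subtitle_text):
        hours, minutes, seconds, millis = (int(g) for g in match.groups()[:4])
        starts.append(((hours * 60 + minutes) * 60 + seconds) * 1000 + millis)
    return starts


def timing_drift_ms(reference_text: str, hypothesis_text: str) -> list[int]:
    """Offset of each hypothesis cue start from the matching reference cue.

    Cues are paired by index (an assumption of this sketch); positive
    values mean the hypothesis cue starts later than the reference.
    """
    ref = start_times_ms(reference_text)
    hyp = start_times_ms(hypothesis_text)
    return [h - r for r, h in zip(ref, hyp)]


reference_srt = (
    "1\n00:00:01,000 --> 00:00:03,000\nHello\n\n"
    "2\n00:00:04,500 --> 00:00:06,000\nWorld\n"
)
hypothesis_srt = (
    "1\n00:00:01,250 --> 00:00:03,250\nHello\n\n"
    "2\n00:00:04,400 --> 00:00:06,000\nWorld\n"
)
drift = timing_drift_ms(reference_srt, hypothesis_srt)
print(drift)                        # [250, -100]
print(max(abs(d) for d in drift))   # 250
```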

Common use cases

Transcription QA — score an auto-generated transcript against a human reference.
Subtitle review — verify a translated or edited subtitle file against the original.
Speech-to-text evaluation — benchmark different ASR engines on the same audio.
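
As an illustration of the benchmarking use case, the following Python sketch ranks engines by corpus-level WER: total word errors across all utterances divided by total reference words, which keeps long utterances from being underweighted compared with averaging per-utterance scores. The engine names and sample sentences are made up for the example, and the helper names are hypothetical.

```python
def edit_distance(ref: list[str], hyp: list[str]) -> int:
    """Minimum number of word substitutions, deletions, and insertions."""
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            curr.append(min(
                prev[j - 1] + (ref_word != hyp_word),  # match or substitution
                prev[j] + 1,                           # deletion
                curr[j - 1] + 1,                       # insertion
            ))
        prev = curr
    return prev[-1]


def corpus_wer(pairs) -> float:
    """WER over (reference, hypothesis) pairs, pooled across the corpus."""
    pairs = list(pairs)
    errors = sum(edit_distance(r.split(), h.split()) for r, h in pairs)
    words = sum(len(r.split()) for r, _ in pairs)
    if words == 0:
        raise ValueError("references must contain at least one word")
    return errors / words


def rank_engines(references, outputs_by_engine):
    """Return (engine, WER) tuples sorted from lowest to highest WER."""
    scores = {
        name: corpus_wer(zip(references, hypotheses))
        for name, hypotheses in outputs_by_engine.items()
    }
    return sorted(scores.items(), key=lambda item: item[1])


references = ["turn on the lights", "play some jazz music"]
outputs = {  # made-up engine outputs for illustration
    "engine_a": ["turn on the light", "play some jazz music"],
    "engine_b": ["turn the lights", "play jazz music please"],
}
for name, score in rank_engines(references, outputs):
    print(f"{name}: {score:.1%}")
```

Here engine_a makes one error over eight reference words (12.5%) and engine_b makes three (37.5%), so engine_a ranks first.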

Try it now

Want to know exactly how accurate a transcript or subtitle file is? Upload it with the reference and let DiffALL compute the Word Error Rate and show you every error.