Frequency-Based Vocabulary Check: Audit Any App Fast (2026)

You open a new language app, start Lesson 1, and five minutes later you’re learning “parrot”, “suitcase”, and “canyon”. It feels fun, but also… off. Are you building a foundation, or collecting trivia?

A frequency-based vocabulary approach fixes that problem by teaching words you’ll meet constantly in real text and speech. The catch is that apps don’t always say how their word lists were built, and “common words” can mean almost anything.

This guide gives you a quick audit method, a scoring rubric, and a few fast red flags that reveal random word lists before you waste weeks.

What “frequency-based vocabulary” looks like in a real course

A frequency-based course doesn’t mean “boring words only”. It means the early units are packed with words that unlock lots of sentences quickly: function words (the, and, to), core verbs (be, have, go), basic adjectives (good, new), and everyday nouns (time, person, day). These words aren’t flashy, but they carry huge weight.

This happens for a simple reason: languages follow a pattern often described by Zipf’s law. A small set of words shows up again and again, and then word frequency drops fast. So the first few thousand lemmas (dictionary base forms like go instead of goes/went) give you far more “coverage” than the next few thousand.
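To make the coverage claim concrete, here is a rough sketch. It assumes an idealized Zipf distribution (frequency proportional to 1/rank) over a hypothetical 50,000-lemma vocabulary. Real corpora deviate from this model, so treat the output as an illustration of the shape, not as measured coverage. The function name and parameters are made up for this example.

```python
def zipf_coverage(top_n, vocab_size=50_000, exponent=1.0):
    """Share of running text covered by the top_n ranks under an idealized Zipf model."""
    # Weight of each rank is 1 / rank**exponent (an assumption, not corpus data).
    weights = [1.0 / rank ** exponent for rank in range(1, vocab_size + 1)]
    return sum(weights[:top_n]) / sum(weights)


first_2000 = zipf_coverage(2_000)
next_2000 = zipf_coverage(4_000) - first_2000

print(f"first 2,000 lemmas: {first_2000:.0%} of tokens")
print(f"next 2,000 lemmas:  {next_2000:.0%} of tokens")
```

Under these assumptions the second block of 2,000 lemmas adds only a small fraction of what the first block covers, which is the drop-off the paragraph above describes.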

A good app doesn’t have to be perfectly sorted by frequency, but you should see these signs early:

New vocabulary supports lots of basic sentence frames, not just labels for objects.
You learn words that combine well (common verb plus common noun), not isolated museum pieces.
The same high-use words reappear across topics because real language repeats.

If you’re comparing apps, treat vocabulary strategy as a core feature, not a footnote. Many “which app is best” comparisons focus on price and UI, but vocabulary selection can decide whether you reach comfortable reading and listening in months or in years. If you’re still shopping, a side-by-side review (for example, this Rosetta Stone vs Duolingo comparison) can help you ask better questions. Then add the frequency audit below.

A 15-minute spot check: pull a word list, lemmatize it, and estimate frequency

You don’t need special software. You need a small sample and one decent frequency list.

Step-by-step spot check method

Pick an early lesson, ideally within the first 5 to 10 lessons (or A1 Unit 1).
Export the vocab list if the app allows it. If not, screenshot the “new words” screen and type them into a notes app. Aim for 30 to 50 items.
Normalize to lemmas (base forms).
- Verbs: goes, went, going → go
- Nouns: dogs → dog
- Adjectives: bigger → big
If the app teaches inflections on purpose, keep the taught form in a second column, but score frequency using the lemma.
Remove obvious proper nouns (Paris, Netflix, Juan) into their own bucket. Don’t let them distort the result.
Check frequency using a trusted resource (pick one that fits your language).
- For American English, SUBTLEX is widely used: see the official SUBTLEXus word frequency page.
- For multi-language frequency dictionaries and coverage stats, Leipzig is practical: Leipzig Frequency Dictionaries.
- For quick, language-specific lists, Wiktionary can work as a starting point: Wiktionary frequency lists.
- For English lists often used in teaching materials, you can also consult a BNC/COCA-style resource, for example the BNC-COCA lists project.
Estimate “how much is truly common”.
- Count how many lemmas land in the top 2,000 (very strong signal).
- Count how many land in the top 5,000 (still useful for everyday topics).
You won’t get perfect matches every time (spelling variants, multiword expressions), but a pattern will show up fast.
Interpret your result (for early lessons):
- If most lemmas are in the top 2,000, the app is likely aiming for frequency-based coverage.
- If a large chunk is outside the top 5,000, the list is probably theme-first, not frequency-first.
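If you’d rather count with a script than by hand, the steps above can be sketched roughly as follows. Everything here is illustrative: the ranks in `FREQUENCY_RANK` are invented placeholders (swap in ranks from whichever frequency list you picked), the `LEMMA_OF` map is hand-made for this sample, and treating unlisted lemmas as rare is an assumption.

```python
# Hypothetical ranks for illustration; replace with ranks from your chosen frequency list.
FREQUENCY_RANK = {
    "go": 50, "want": 120, "big": 300, "dog": 800,
    "suitcase": 7500, "parrot": 12000,
}

# Hand-made lemma map for this sample (an assumption; a lemmatizer could replace it).
LEMMA_OF = {"goes": "go", "went": "go", "going": "go", "dogs": "dog", "bigger": "big"}


def spot_check(words, proper_nouns=frozenset()):
    """Return top-2,000 and top-5,000 counts and shares for a lesson sample."""
    lemmas = set()
    for word in words:
        if word in proper_nouns:
            continue  # proper nouns go in their own bucket and are not scored
        lowered = word.lower()
        lemmas.add(LEMMA_OF.get(lowered, lowered))

    # Lemmas missing from the list are treated as rare (rank = infinity).
    ranks = [FREQUENCY_RANK.get(lemma, float("inf")) for lemma in lemmas]
    total = len(ranks)
    top_2000 = sum(rank <= 2000 for rank in ranks)
    top_5000 = sum(rank <= 5000 for rank in ranks)
    return {
        "lemmas": total,
        "top_2000": top_2000,
        "top_5000": top_5000,
        "share_top_2000": top_2000 / total if total else 0.0,
        "share_outside_5000": (total - top_5000) / total if total else 0.0,
    }


sample = ["goes", "went", "dogs", "bigger", "want", "suitcase", "parrot", "Paris"]
result = spot_check(sample, proper_nouns={"Paris"})
print(result)
```

Scoring on the set of lemmas (not the raw forms) means goes and went count once as go, which matches the "score frequency using the lemma" rule in step 3.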

Quick 0 to 10 scoring rubric (add up the points)

Criterion (0–2 each)	0 points	1 point	2 points
Top-2,000 coverage (early lesson sample)	Under 40%	40–65%	Over 65%
Lemma handling (go/went grouped)	No, forms treated as separate “words”	Mixed	Yes, consistent
Function words appear early	Rare or avoided	Some	Frequent and practiced
Collocations and chunks	Single-word drilling only	Some phrases	Many common chunks
Transparency	No mention of source or method	Vague “common words”	Clear corpus or list named

A score of 8 to 10 usually means the app’s vocabulary plan is intentional. A 4 to 7 means it’s mixed. A 0 to 3 means you’re likely looking at themed lists dressed up as a curriculum.
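For anyone comparing several apps, the rubric is easy to tally in a few lines. This is a minimal sketch: the criterion keys and function names are hypothetical labels chosen for the example, while the thresholds and score bands come from the table and paragraph above.

```python
CRITERIA = (
    "top_2000_coverage",
    "lemma_handling",
    "function_words",
    "collocations",
    "transparency",
)


def coverage_points(share):
    """Map a top-2,000 share (0.0 to 1.0) to rubric points."""
    if share > 0.65:
        return 2
    if share >= 0.40:
        return 1
    return 0


def rubric_total(scores):
    """Sum the five criteria (0 to 2 each) and name the score band."""
    for name in CRITERIA:
        if scores[name] not in (0, 1, 2):
            raise ValueError(f"{name} must be 0, 1, or 2")
    total = sum(scores[name] for name in CRITERIA)
    if total >= 8:
        band = "intentional"
    elif total >= 4:
        band = "mixed"
    else:
        band = "themed lists"
    return total, band


# Example scores for an imaginary app (made-up values).
example = {
    "top_2000_coverage": coverage_points(0.70),
    "lemma_handling": 1,
    "function_words": 2,
    "collocations": 1,
    "transparency": 0,
}
print(rubric_total(example))
```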

Red flags that reveal random word lists (even when SRS looks good)

Some apps hide weak vocabulary choices behind a slick review system. That’s why it helps to separate two ideas:

Frequency-based ordering decides what you learn first.
Spaced repetition (SRS) decides what you review, and when.

An app can have excellent SRS and still teach low-value words early. SRS is the treadmill. Frequency is the route.

Here are the fastest red flags to spot:

Proper nouns and “tourist brochure” words too soon

A few place names are fine, but if Lesson 2 is heavy on cities, brands, and famous people, the list is probably curated for vibes, not usefulness.

Niche themed units with no core scaffolding

A “kitchen tools” unit can be helpful, but it shouldn’t replace basics like common verbs, pronouns, and connectors. If you can name 20 objects but can’t say “I want it” or “I don’t know”, the order is off.

No lemmatization, so the list looks bigger than it is

If an app treats go, goes, went, going as four separate vocabulary wins, it inflates progress and slows coverage. You want inflected forms grouped under one lemma, even if each form is taught.

Archaic or rare words mixed into the first wave

Random lists often sprinkle in unusual items (formal synonyms, literary words) early because they fit a topic set. Frequency-based courses delay those until you have enough core language to make them stick.

Unrealistic part-of-speech mix

Real beginner input is heavy on function words plus a small set of high-use verbs. If your “new words” are mostly nouns and adjectives, you’re building a sticker book, not a working language.
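A quick way to check the mix is to tag each sampled word by hand and count the shares. The sketch below assumes you supply the tags yourself. The tag names and the sample are invented for illustration, and the article sets no numeric cutoff for "mostly", so the script only reports the share.

```python
from collections import Counter


def pos_shares(tagged_words):
    """Return the share of each part-of-speech tag in a hand-tagged sample."""
    counts = Counter(pos for _, pos in tagged_words)
    total = sum(counts.values())
    return {pos: n / total for pos, n in counts.items()} if total else {}


# Hand-tagged sample (hypothetical lesson content).
sample_tags = [
    ("parrot", "noun"), ("suitcase", "noun"), ("canyon", "noun"),
    ("red", "adjective"), ("go", "verb"), ("the", "function"),
]
shares = pos_shares(sample_tags)
noun_adj_share = shares.get("noun", 0.0) + shares.get("adjective", 0.0)
print(f"nouns + adjectives: {noun_adj_share:.0%} of new words")
```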

Ignoring collocations and common chunks

High-frequency vocabulary isn’t just single words. It’s also combinations like “have to”, “I want to”, “in front of”, and verb plus noun pairings that show up constantly. If the app avoids chunks, your speaking will sound stitched together.

Conclusion

A frequency-based course should feel like learning the joints and muscles of a language, not collecting souvenirs. With one early-lesson sample, a quick lemma pass, and a check against a public frequency resource, you can see whether an app is built around frequency-based vocabulary or around random themes.

Try the spot check on two apps you’re considering and score them. The numbers won’t be perfect, but the pattern will be clear. If the early words don’t buy you real sentences, it’s time to keep shopping.
