Implementing AI-Powered Typo Correction in Mobile Search
Mobile keyboards miss. Swipe input, autocorrect, and small keys push mobile typo rates to 2–3x those on desktop. Users still expect search to understand them rather than return "0 results", and a "0 results" screen means they leave the app.
Types of errors and correction approaches
Mobile input errors fall into three categories, each requiring its own tool:
Typos (insertion, deletion, substitution, transposition): "krosovki" instead of "krossovki." Handled by edit-distance algorithms: Levenshtein, or Damerau-Levenshtein, which also counts a swap of two adjacent characters as a single edit (a typical slip on mobile keyboards).
Phonetic errors: users spell a word the way it sounds, "naik" → "nike". Phonetic encoders (for example Soundex or Metaphone for English) map similar-sounding spellings to the same key; languages with interchangeable character variants need an encoder built for that language.
Transliteration: "krossovki," "кроссовки," and "crossovki" should return identical results. Apply standard transliteration tables and normalization before indexing.
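As a rough illustration of the first and third categories, the sketch below normalizes Cyrillic input to Latin and measures edit distance with adjacent transpositions counted as one edit (the "optimal string alignment" variant of Damerau-Levenshtein). The function names and the simplified romanization table are assumptions made for this example, not part of any library.

```python
# Illustrative sketch: names and the mapping table are assumptions.

# Simplified Russian-to-Latin table (one of several possible romanizations).
CYRILLIC_TO_LATIN = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "е": "e", "ё": "e",
    "ж": "zh", "з": "z", "и": "i", "й": "i", "к": "k", "л": "l", "м": "m",
    "н": "n", "о": "o", "п": "p", "р": "r", "с": "s", "т": "t", "у": "u",
    "ф": "f", "х": "kh", "ц": "ts", "ч": "ch", "ш": "sh", "щ": "shch",
    "ъ": "", "ы": "y", "ь": "", "э": "e", "ю": "yu", "я": "ya",
}


def normalize(text: str) -> str:
    """Lowercase the text and transliterate Cyrillic characters to Latin."""
    return "".join(CYRILLIC_TO_LATIN.get(ch, ch) for ch in text.lower())


def osa_distance(a: str, b: str) -> int:
    """Edit distance where an adjacent transposition costs one edit."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all of a[:i]
    for j in range(cols):
        d[0][j] = j  # insert all of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # Adjacent transposition, e.g. "ki" <-> "ik".
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]
```

With these definitions, normalize("Кроссовки") yields "krossovki", and "crossovki" ends up one substitution away from it, so the same edit-distance step can absorb spelling variants that the transliteration table does not cover.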
Elasticsearch: spell correction out of the box and its limits
ES provides a term suggester with fuzzy matching. It works, but:
- it searches for the nearest terms in the index by edit distance and does not consider query context
- it is weak on short tokens (< 4 characters), where one or two edits already reach many unrelated words
- its frequency awareness is limited to document frequency in the index, so a misspelling that made it into product data can rank next to a brand term
    # ES term suggester: basic level
    # Assumes `es` is an async Elasticsearch client and this runs inside an async function.
    response = await es.search(
        index="products",
        body={
            "suggest": {
                "spell_suggest": {
                    "text": query,
                    "term": {
                        "field": "title",
                        "suggest_mode": "missing",  # only if term not found
                        "max_edits": 2,
                        "min_word_length": 4,
                        "string_distance": "jaro_winkler"
                    }
                }
            }
        }
    )
jaro_winkler works better for short strings than levenshtein—it gives more weight to matches at the string start.
SymSpell: orders of magnitude faster than Levenshtein
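For production at > 1000 queries/sec, naive Levenshtein matching does not scale: each comparison costs O(m·n) in the lengths of the two strings, and a naive lookup repeats it for every dictionary term. SymSpell (Symmetric Delete) precomputes all deletions of each dictionary term up to the maximum edit distance and stores them in a hash table. At query time it only generates deletions of the input and looks them up, so lookup cost is close to constant with respect to dictionary size.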
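To make the symmetric-delete idea concrete, here is a minimal sketch of the index and the candidate lookup. It is a simplification written for this article, not the SymSpell implementation: the helper names are hypothetical, and it skips the steps the real algorithm adds on top (verifying each candidate with a true edit distance, prefix truncation, ranking by frequency).

```python
from collections import defaultdict
from typing import Dict, Iterable, Set


def deletes(word: str, max_distance: int) -> Set[str]:
    """The word itself plus every string obtained by deleting up to max_distance characters."""
    results = {word}
    frontier = {word}
    for _ in range(max_distance):
        frontier = {w[:i] + w[i + 1:] for w in frontier for i in range(len(w))}
        results |= frontier
    return results


def build_index(terms: Iterable[str], max_distance: int = 2) -> Dict[str, Set[str]]:
    """Map each delete-variant to the dictionary terms that produce it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for term in terms:
        for variant in deletes(term, max_distance):
            index[variant].add(term)
    return index


def candidates(query: str, index: Dict[str, Set[str]], max_distance: int = 2) -> Set[str]:
    """Dictionary terms sharing at least one delete-variant with the query.

    This is a superset of the true matches: a real implementation would
    still check each candidate with an edit-distance function.
    """
    found: Set[str] = set()
    for variant in deletes(query, max_distance):
        found |= index.get(variant, set())
    return found
```

The expensive part (generating deletions for the whole dictionary) happens once at startup; each query then costs a handful of hash lookups. The symspellpy library packages the full algorithm, including verification and frequency ranking: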
    from symspellpy import SymSpell

    sym_spell = SymSpell(max_dictionary_edit_distance=2, prefix_length=7)
    sym_spell.load_dictionary("frequency_dict.txt", term_index=0, count_index=1)

    def correct_query(query: str) -> str:
        suggestions = sym_spell.lookup_compound(
            query,
            max_edit_distance=2,
            transfer_casing=True
        )
        # Return the correction only if something actually changed
        if suggestions and suggestions[0].distance > 0:
            return suggestions[0].term
        return query
Build the frequency dictionary from your app's own search logs. This matters because term frequencies are domain-specific: in your catalog "leather belt" may be the top query, while a generic dictionary would rank "leather jacket" higher.
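A minimal sketch of that step, assuming the search log is available as an iterable of raw query strings (how you export it is up to your logging setup). The helper names are hypothetical; the output uses the "term count" one-entry-per-line layout that the load_dictionary call above reads with term_index=0 and count_index=1.

```python
import re
from collections import Counter
from typing import Iterable, List


def count_terms(queries: Iterable[str]) -> Counter:
    """Count how often each lowercase word appears across logged queries."""
    counts: Counter = Counter()
    for query in queries:
        counts.update(re.findall(r"\w+", query.lower()))
    return counts


def to_dictionary_lines(counts: Counter, min_count: int = 1) -> List[str]:
    """Format counts as 'term count' lines, most frequent first."""
    return [
        f"{term} {count}"
        for term, count in counts.most_common()
        if count >= min_count
    ]


def write_dictionary(counts: Counter, path: str = "frequency_dict.txt") -> None:
    """Write the dictionary file in space-separated 'term count' format."""
    with open(path, "w", encoding="utf-8") as f:
        f.write("\n".join(to_dictionary_lines(counts)) + "\n")
```

Raising min_count is a cheap way to keep one-off misspellings from the logs out of the dictionary, since rare terms in search logs are often typos themselves.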
Contextual correction via N-gram Language Model
SymSpell scores each word on its own, without looking at its neighbors. For "krosovki adidas" that is enough: both words come out right. But for a query such as "whitr sneakers", both "white" and "whiter" are one edit away from the first word, and SymSpell has no way of knowing which one fits the context.
An N-gram language model trained on search logs helps pick the right variant: P("white sneakers") >> P("whiter sneakers").
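The ranking step can be sketched with a bigram model and add-one smoothing. This is a toy version under stated assumptions: the class and function names are made up for the example, the training data is a plain list of logged queries, and a production model would likely use better smoothing and far more data.

```python
import math
from collections import Counter
from typing import Iterable, List


class BigramModel:
    """Bigram language model with add-one (Laplace) smoothing."""

    def __init__(self, queries: Iterable[str]) -> None:
        self.unigrams: Counter = Counter()
        self.bigrams: Counter = Counter()
        for query in queries:
            tokens = ["<s>"] + query.lower().split()  # <s> marks query start
            self.unigrams.update(tokens)
            self.bigrams.update(zip(tokens, tokens[1:]))
        self.vocab_size = len(self.unigrams)

    def log_prob(self, phrase: str) -> float:
        """Sum of smoothed log P(word | previous word) over the phrase."""
        tokens = ["<s>"] + phrase.lower().split()
        total = 0.0
        for prev, cur in zip(tokens, tokens[1:]):
            numerator = self.bigrams[(prev, cur)] + 1
            denominator = self.unigrams[prev] + self.vocab_size
            total += math.log(numerator / denominator)
        return total


def pick_best(model: BigramModel, candidates: List[str]) -> str:
    """Return the candidate phrase the model considers most likely."""
    return max(candidates, key=model.log_prob)
```

Given a log where "white sneakers" occurs and "whiter sneakers" never does, pick_best prefers "white sneakers", because the bigram ("white", "sneakers") has been seen and ("whiter", "sneakers") only receives the smoothing mass.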
Mobile integration: correction UX
    // Android: display correction with revert option
    @Composable
    fun SearchResultsHeader(
        originalQuery: String,
        correctedQuery: String?,
        onRevertToOriginal: () -> Unit
    ) {
        if (correctedQuery != null && correctedQuery != originalQuery) {
            Row(
                modifier = Modifier.padding(horizontal = 16.dp, vertical = 8.dp),
                verticalAlignment = Alignment.CenterVertically
            ) {
                Text(
                    text = buildAnnotatedString {
                        append("Results for: ")
                        withStyle(SpanStyle(fontWeight = FontWeight.Bold)) {
                            append(correctedQuery)
                        }
                    }
                )
                Spacer(modifier = Modifier.weight(1f))
                TextButton(onClick = onRevertToOriginal) {
                    Text("Search «$originalQuery»")
                }
            }
        }
    }
The pattern "we corrected it—but you can revert" is industry standard. Never force correction without an escape hatch.
    // iOS: analogous approach via SwiftUI
    import SwiftUI

    struct CorrectionNoticeView: View {
        let original: String
        let corrected: String
        let onRevert: () -> Void

        var body: some View {
            HStack {
                Text("Showing results for «\(corrected)»")
                    .font(.subheadline)
                Spacer()
                Button("Search «\(original)»", action: onRevert)
                    .font(.subheadline)
            }
            .padding(.horizontal)
            .padding(.vertical, 6)
            .background(Color(.systemGray6))
        }
    }
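Both clients need two things from the backend: a response that carries the original and the corrected query, and a way to repeat the search with correction switched off when the user taps revert. One possible shape for that contract is sketched below. Everything here is hypothetical (the SearchResponse fields, the force_original flag, and the injected correct and run_search callables), so treat it as an outline rather than a reference API.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class SearchResponse:
    original_query: str
    corrected_query: Optional[str]  # None when no correction was applied
    hits: List[str]


def search_with_correction(
    query: str,
    correct: Callable[[str], str],
    run_search: Callable[[str], List[str]],
    force_original: bool = False,
) -> SearchResponse:
    """Search with the corrected query unless the client asked to revert."""
    corrected: Optional[str] = None
    if not force_original:
        suggestion = correct(query)
        if suggestion != query:
            corrected = suggestion
    # Fall back to the original text when nothing was corrected.
    effective_query = corrected if corrected is not None else query
    return SearchResponse(query, corrected, run_search(effective_query))
```

The revert button then simply repeats the request with force_original set, and the header component is hidden whenever corrected_query comes back empty.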
Process
1. Build a frequency dictionary from app search logs.
2. Analyze typical typos: which characters get confused on mobile keyboards.
3. Set up SymSpell + ES phrase suggester.
4. Integrate correction into the search API and client UX.
5. Track the quality metric: zero-result rate before and after deployment.
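The quality metric from the last step is simple to compute once each logged search records how many hits it returned. A small sketch, assuming the hit counts are available as a list of integers per period (the function names are illustrative):

```python
from typing import Sequence


def zero_result_rate(hit_counts: Sequence[int]) -> float:
    """Share of searches that returned no results (0.0 for an empty log)."""
    if not hit_counts:
        return 0.0
    return sum(1 for hits in hit_counts if hits == 0) / len(hit_counts)


def relative_reduction(before: float, after: float) -> float:
    """Fractional drop in the rate, e.g. 0.5 means the rate was halved."""
    if before == 0:
        return 0.0
    return (before - after) / before
```

Comparing the rate over equal-length periods before and after the rollout keeps seasonal traffic changes from being mistaken for an effect of the correction.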
Timeline estimates
ES fuzzy matching + SymSpell with an off-the-shelf dictionary: 2–4 days. A custom frequency dictionary from logs plus an N-gram LM for contextual correction: an additional 1–2 weeks.







