Paper details
Number 2 - June 2005
Volume 15 - 2005
Correcting spelling errors by modelling their causes
Sebastian Deorowicz, Marcin G. Ciura
Abstract
This paper describes a new technique for correcting isolated words in typed texts. A language-dependent set of string substitutions reflects the surface form of errors that result from vocabulary incompetence, misspellings, or mistypings. Candidate corrections are formed by applying the substitutions to text words absent from the computer lexicon. A minimal acyclic deterministic finite automaton storing the lexicon allows quick rejection of nonsense corrections, while costs associated with the substitutions serve to rank the remaining ones. A comparison of the correction lists generated by several spellcheckers for two corpora of English spelling errors shows that our technique suggests the right words more accurately than the others.
Keywords
spelling correction, finite state automata, spelling errors
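The pipeline summarised in the abstract (apply substitutions, reject candidates missing from the lexicon, rank the rest by cost) can be illustrated with a minimal sketch. It is not the authors' implementation: the substitution rules, their costs, the tiny lexicon, and the function name `suggest` are all hypothetical, a plain Python `frozenset` stands in for the minimal acyclic deterministic finite automaton, and only one substitution is applied per candidate.

```python
from typing import Dict, List, Tuple

# Hypothetical rules: (erroneous fragment, intended fragment, cost).
# The paper's actual language-dependent rule set and costs are not given here.
SUBSTITUTIONS: List[Tuple[str, str, float]] = [
    ("ie", "ei", 1.0),
    ("f", "ph", 1.5),
    ("e", "a", 2.0),
    ("a", "i", 2.0),
    ("h", "", 3.0),
]

# Hypothetical toy lexicon. A set is used as a stand-in for the minimal
# acyclic DFA; it offers the same membership test on a small scale.
LEXICON = frozenset({"receive", "phone", "separate", "definite", "their", "tier"})


def suggest(
    word: str,
    substitutions: List[Tuple[str, str, float]] = SUBSTITUTIONS,
    lexicon: frozenset = LEXICON,
) -> List[Tuple[str, float]]:
    """Return (correction, cost) pairs for a word absent from the lexicon,
    cheapest first. Words already in the lexicon yield an empty list."""
    if word in lexicon:
        return []
    best: Dict[str, float] = {}
    for wrong, right, cost in substitutions:
        # Try the rule at every position where the erroneous fragment occurs.
        start = word.find(wrong)
        while start != -1:
            candidate = word[:start] + right + word[start + len(wrong):]
            # Reject candidates that are not real words; keep the lowest cost.
            if candidate in lexicon and cost < best.get(candidate, float("inf")):
                best[candidate] = cost
            start = word.find(wrong, start + 1)
    # Rank by cost, breaking ties alphabetically for a stable order.
    return sorted(best.items(), key=lambda item: (item[1], item[0]))


print(suggest("thier"))
```

With these illustrative rules, `suggest("thier")` ranks `their` ahead of `tier`, because the transposition rule is assigned a lower cost than the deletion rule. A real system would chain several substitutions and walk the automaton while building candidates, so that dead-end prefixes are abandoned early.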