How does EditLive! decide which words to check for spelling errors?

We've recently seen a few customers create custom dictionaries containing words that our spelling checker will simply never check.  We use a third-party spelling component, so we don't have much control over this process.  Here are the rules the component uses to determine which words are checked for spelling errors; keep them in mind when constructing a custom dictionary.

  • A word is an alphanumeric character followed by any sequence of alphanumerics or apostrophes.
  • Hyphens are word delimiters, so each part of a hyphenated word is checked separately.
  • Periods surrounded by alphanumerics are considered part of the word, and trailing periods are considered part of the word if the word contains embedded periods interspersed among no more than two consecutive alphanumerics (e.g., the period at the end of U.S.A. is considered part of the word, but the periods at the end of USA. and ephox.com. are not).
  • Apostrophes at the end of a word are considered part of the word if they are preceded by the letter "s".
  • An "at sign" (@) is considered part of the word if it is surrounded by alphanumerics and the following word contains embedded periods (i.e., appears to be an e-mail address).
  • The string "://" is considered part of the word if it is surrounded by alphanumerics (i.e., appears to be part of a URL).
  • A slash (/) is considered part of the word if it is surrounded by alphanumerics and the preceding part contains embedded periods (i.e., appears to be part of a URL).
  • Characters &, %, +, =, ?, and _ are considered part of the word if the word contains embedded periods (i.e., appears to be part of a URL).
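To get a feel for how these rules carve up text, here is a rough Python sketch of the simplest ones only: the basic word definition, hyphens as delimiters, and the trailing-apostrophe rule.  It is not the third-party component's code.  The function name candidate_words is made up for illustration, it assumes ASCII alphanumerics, and it deliberately ignores the period, e-mail and URL rules, so anything containing periods or slashes will come out differently from the real checker.

```python
import re

# Hypothetical approximation: an alphanumeric followed by any run of
# alphanumerics or apostrophes. ASCII-only is an assumption of this sketch.
WORD_RE = re.compile(r"[A-Za-z0-9][A-Za-z0-9']*")


def candidate_words(text):
    """Return the words this sketch would hand to the spelling checker.

    Hyphens are never matched by WORD_RE, so they act as delimiters.
    The period, e-mail and URL rules are NOT modelled here.
    """
    words = []
    for match in WORD_RE.finditer(text):
        word = match.group()
        stripped = word.rstrip("'")
        if len(stripped) < len(word) and stripped.endswith("s"):
            # A trailing apostrophe stays only when preceded by "s".
            # (The rules say the letter "s"; how an uppercase "S" is
            # treated is not specified, so this sketch matches lowercase only.)
            word = stripped + "'"
        else:
            word = stripped
        words.append(word)
    return words


print(candidate_words("The well-known dogs' bowl isn't 'quoted'"))
```

So a custom dictionary entry such as "well-known" would never match under the default rules, because the checker only ever sees "well" and "known".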

We support two special cases.  Both can be changed by creating a file called "Spelling.properties" in your dictionary jar file and adding the text specified below.  If you specify both options, each must be on a separate line.  For an example of this, look at the contents of our French and Italian dictionary jar files.

  • Hyphenated words can be checked as a single word with the option SPLIT_HYPHENATED_WORDS_OPT=false.  With splitting turned off, hyphens surrounded by alphanumerics are considered part of the word.
  • Apostrophes can be turned into word delimiters with the option SPLIT_CONTRACTED_WORDS_OPT=true.  This is turned on by default in our French and Italian dictionaries.
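As an illustration, a "Spelling.properties" file that sets both options would contain nothing more than the two lines below, one option per line.  The values shown are simply the ones described above; use only the line or lines you actually need.

```properties
SPLIT_HYPHENATED_WORDS_OPT=false
SPLIT_CONTRACTED_WORDS_OPT=true
```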

Note that both of these options are global: if changed, they apply to the entire editor, not just your custom dictionary.

Andy is Ephox's programming switch hitter; there aren't many products at Ephox that Andy hasn't been involved in. He spends his days trying to convince people that they should listen to more podcasts.
