Spell Checker for Python

Question

I'm fairly new to Python and NLTK. I am working on an application that performs spell checks (it replaces an incorrectly spelled word with the correct one). I'm currently using the Enchant library on Python 2.7 (PyEnchant) and the NLTK library. The code below is a class that handles the correction/replacement:

    import enchant
    from nltk.metrics import edit_distance

    class SpellingReplacer:
        def __init__(self, dict_name='en_GB', max_dist=2):
            self.spell_dict = enchant.Dict(dict_name)
            # use the argument rather than a hard-coded 2
            self.max_dist = max_dist

        def replace(self, word):
            # correctly spelled words are returned unchanged
            if self.spell_dict.check(word):
                return word
            suggestions = self.spell_dict.suggest(word)

            if suggestions and edit_distance(word, suggestions[0]) <= self.max_dist:
                return suggestions[0]
            else:
                return word

I have written a function that takes in a list of words, executes replace() on each word, and then returns a list of those words, but spelled correctly:

    def spell_check(word_list):
        checked_list = []
        for item in word_list:
            replacer = SpellingReplacer()
            r = replacer.replace(item)
            checked_list.append(r)
        return checked_list

    >>> word_list = ['car', 'colour']
    >>> spell_check(word_list)
    ['car', 'color']

Now, I don't really like this because it isn't very accurate, and I'm looking for a better way to check and replace misspelled words. I also need something that can pick up spelling mistakes like "caaaar". Are there better ways to perform spelling checks out there? If so, what are they? How does Google do it? Their spelling suggester is very good.

Any suggestions?

User · Answer

Try jamspell. It works pretty well for automatic spelling correction:

    import jamspell

    corrector = jamspell.TSpellCorrector()
    corrector.LoadLangModel('en.bin')

    corrector.FixFragment('Some sentnec with error')
    # u'Some sentence with error'

    corrector.GetCandidates(['Some', 'sentnec', 'with', 'error'], 1)
    # (u'sentence', u'senate', u'scented', u'sentinel')

User · Answer

Use "from autocorrect import spell". You need to install the autocorrect package first (I prefer Anaconda). It only works on single words, not sentences, so that is a limitation you will face.

    from autocorrect import spell
    print(spell('intrerpreter'))
    # output: interpreter

User · Answer

You can use the autocorrect lib to spell check in Python.

Example usage:

    from autocorrect import Speller

    spell = Speller(lang='en')

    print(spell('caaaar'))
    print(spell('mussage'))
    print(spell('survice'))
    print(spell('hte'))

Result:

    caesar
    message
    service
    the

User · Answer

pyspellchecker is one of the best solutions for this problem. The pyspellchecker library is based on Peter Norvig's blog post. It uses a Levenshtein distance algorithm to find permutations within an edit distance of 2 from the original word.

There are two ways to install this library. The official document highly recommends using the pipenv package.

Install using pip:

    pip install pyspellchecker

Install from source:

    git clone https://github.com/barrust/pyspellchecker.git
    cd pyspellchecker
    python setup.py install

The following code is the example provided in the documentation:

    from spellchecker import SpellChecker

    spell = SpellChecker()

    # find those words that may be misspelled
    misspelled = spell.unknown(['something', 'is', 'hapenning', 'here'])

    for word in misspelled:
        # Get the one `most likely` answer
        print(spell.correction(word))

        # Get a list of `likely` options
        print(spell.candidates(word))

User · Answer

The best options for spell checking in Python are SymSpell, a BK-tree, or Peter Norvig's method. The fastest one is SymSpell.

Method 1: pyspellchecker. This library is based on Peter Norvig's implementation.

    pip install pyspellchecker

    from spellchecker import SpellChecker

    spell = SpellChecker()

    # find those words that may be misspelled
    misspelled = spell.unknown(['something', 'is', 'hapenning', 'here'])

    for word in misspelled:
        # Get the one `most likely` answer
        print(spell.correction(word))

        # Get a list of `likely` options
        print(spell.candidates(word))

Method 2: SymSpell Python.

    pip install -U symspellpy
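To give a feel for why SymSpell is fast, here is a minimal pure-Python sketch of the symmetric-delete idea it is built on. This is an illustration only, not the symspellpy API: the function names (deletes_within, build_index, lookup) and the toy dictionary are made up for this example. The idea is to precompute every string obtainable by deleting up to max_dist characters from each dictionary word, then do the same for the input word and look for shared variants.

```python
from itertools import combinations


def deletes_within(word, max_dist):
    """Return the word plus every string made by deleting up to max_dist characters."""
    variants = {word}
    for k in range(1, min(max_dist, len(word)) + 1):
        # choose which character positions to keep
        for keep in combinations(range(len(word)), len(word) - k):
            variants.add("".join(word[i] for i in keep))
    return variants


def build_index(dictionary, max_dist=2):
    """Map each delete-variant to the dictionary words that produce it."""
    index = {}
    for entry in dictionary:
        for variant in deletes_within(entry, max_dist):
            index.setdefault(variant, set()).add(entry)
    return index


def lookup(word, index, max_dist=2):
    """Collect dictionary words that share a delete-variant with the input."""
    found = set()
    for variant in deletes_within(word, max_dist):
        found |= index.get(variant, set())
    return found


# Hypothetical toy dictionary, for illustration only.
index = build_index(["sentence", "senate", "error", "some"])
print(sorted(lookup("sentnec", index)))
```

The lookup only generates deletes, never inserts or replacements, which keeps the candidate set small. The result is a set of candidates rather than a final answer: a full implementation would still verify each candidate with a real edit-distance check and rank by word frequency.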

User · Answer

I'd recommend starting by carefully reading this post by Peter Norvig. (I had to do something similar and I found it extremely useful.)

The following function in particular has the ideas that you now need to make your spell checker more sophisticated: splitting, deleting, transposing, replacing, and inserting characters in the irregular words to "correct" them.

    # alphabet is defined elsewhere in Norvig's corrector; lowercase English letters here
    alphabet = 'abcdefghijklmnopqrstuvwxyz'

    def edits1(word):
        splits     = [(word[:i], word[i:]) for i in range(len(word) + 1)]
        deletes    = [a + b[1:] for a, b in splits if b]
        transposes = [a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1]
        replaces   = [a + c + b[1:] for a, b in splits for c in alphabet if b]
        inserts    = [a + c + b     for a, b in splits for c in alphabet]
        return set(deletes + transposes + replaces + inserts)

Note: The above is one snippet from Norvig's spelling corrector.

And the good news is that you can incrementally add to and keep improving your spell checker.

Hope that helps.
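As a rough sketch of how the edit function fits into a complete corrector, the following follows the general shape of Norvig's approach: prefer a known word, then known words one edit away, then two edits away, and pick the most frequent candidate. The WORD_COUNTS table is a hypothetical stand-in; a real checker would count words in a large text corpus.

```python
import string
from collections import Counter

alphabet = string.ascii_lowercase

# Hypothetical word counts, for illustration only.
WORD_COUNTS = Counter({"spelling": 50, "spilling": 5, "something": 40,
                       "happening": 30, "here": 80})


def edits1(word):
    """All strings one delete, transpose, replace, or insert away from word."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [a + b[1:] for a, b in splits if b]
    transposes = [a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1]
    replaces = [a + c + b[1:] for a, b in splits for c in alphabet if b]
    inserts = [a + c + b for a, b in splits for c in alphabet]
    return set(deletes + transposes + replaces + inserts)


def known(words):
    """Keep only the words that appear in the frequency table."""
    return {w for w in words if w in WORD_COUNTS}


def correction(word):
    """Return the most frequent known candidate, or the word itself if none is found."""
    candidates = (known([word])
                  or known(edits1(word))
                  or known(e2 for e1 in edits1(word) for e2 in edits1(e1))
                  or {word})
    return max(candidates, key=lambda w: WORD_COUNTS[w])


print(correction("speling"))
print(correction("hapenning"))
```

Falling back one edit distance at a time means a close, rare word beats a distant, common one, which is a simplification; weighting candidates by both distance and frequency is one of the incremental improvements you could add.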

User · Answer

Spark NLP is another option that I have used, and it works very well. A simple tutorial can be found here: https://github.com/JohnSnowLabs/spark-nlp-workshop/blob/master/jupyter/annotation/english/spell-check-ml-pipeline/Pretrained-SpellCheckML-Pipeline.ipynb

User · Answer

Spell corrector: you need a corpus (a word list) saved on your desktop; if you store it elsewhere, change the path in the code. I have added a small GUI as well using tkinter. This only tackles non-word errors.

    from tkinter import Tk, Label, Entry, Button  # the module is named Tkinter on Python 2

    def min_edit_dist(word1, word2):
        len_1 = len(word1)
        len_2 = len(word2)
        # the matrix whose last element -> edit distance
        x = [[0] * (len_2 + 1) for _ in range(len_1 + 1)]
        # initialization of base case values
        for i in range(len_1 + 1):
            x[i][0] = i
        for j in range(len_2 + 1):
            x[0][j] = j
        for i in range(1, len_1 + 1):
            for j in range(1, len_2 + 1):
                if word1[i-1] == word2[j-1]:
                    x[i][j] = x[i-1][j-1]
                else:
                    x[i][j] = min(x[i][j-1], x[i-1][j], x[i-1][j-1]) + 1
        # index explicitly so empty strings also work
        return x[len_1][len_2]

    def retrieve_text():
        word1 = app_entry.get()
        path = r"C:\Documents and Settings\Owner\Desktop\Dictionary.txt"
        print("Suggestions coming right up count till 10")
        with open(path, 'r') as ffile:
            for line in ffile:
                candidate = line.strip()  # drop the trailing newline before comparing
                if min_edit_dist(word1, candidate) <= 2:
                    print(candidate)

    if __name__ == "__main__":
        app_win = Tk()
        app_win.title("spell")
        app_label = Label(app_win, text="Enter the incorrect word")
        app_label.pack()
        app_entry = Entry(app_win)
        app_entry.pack()
        app_button = Button(app_win, text="Get Suggestions", command=retrieve_text)
        app_button.pack()
        # Initialize GUI loop
        app_win.mainloop()

User · Answer

Maybe it is too late, but I am answering for future searches.

To perform spelling correction, you first need to make sure the word is not nonsense or slang with repeated letters, like "caaaar" or "amazzzing". So we first need to get rid of the extra letters. English words usually have at most 2 identical letters in a row (e.g. "hello"), so we remove the extra repetitions from the words first and then check them for spelling.

For removing the extra letters, you can use the regular expression module (re) in Python.

Once this is done, use the pyspellchecker library to correct the spelling.

For an implementation, visit this link: https://rustyonrampage.github.io/text-mining/2017/11/28/spelling-correction-with-python-and-nltk.html
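A minimal sketch of the repeated-letter step, using only the standard re module. The function name squeeze_repeats is made up for this example, and the linked article may implement the step differently.

```python
import re


def squeeze_repeats(word):
    """Collapse any run of three or more identical characters down to two."""
    return re.sub(r"(.)\1{2,}", r"\1\1", word)


print(squeeze_repeats("caaaar"))     # caar
print(squeeze_repeats("amazzzing"))  # amazzing
print(squeeze_repeats("hello"))      # hello
```

The output is not necessarily a real word ("caar" is not), but it is now close enough for a spell checker such as pyspellchecker to handle in the next step.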
