Resource u tokenizers punkt english pickle not found

Question

My Code   import nltk data tokenizer   nltk data load  nltk tokenizers punkt english pickle     ERROR Message    ec2-user ip-172-31-31-31 sentiment   python mapper local v1 0 py Traceback  most recent call last   File  mapper local v1 0 py   line 16  in  lt module gt       tokenizer   nltk data load  nltk tokenizers punkt english pickle    File   usr lib python2 6 site-packages nltk data py   line 774  in load      opened resource    open resource url   File   usr lib python2 6 site-packages nltk data py   line 888  in  open      return find path   path         open    File   usr lib python2 6 site-packages nltk data py   line 618  in find      raise LookupError resource not found   LookupError   Resource u tokenizers punkt english pickle  not found   Please use the NLTK Downloader to obtain the resource        gt  gt  gt nltk download    Searched in  -   home ec2-user nltk data  -   usr share nltk data  -   usr local share nltk data  -   usr lib nltk data  -   usr local lib nltk data  - u     I m trying to run this program in Unix machine   As per the error message  I logged into python shell from my unix machine then I used the below commands   import nltk nltk download     and then I downloaded all the available things using d- down loader and l- list options but still the problem persists   I tried my best to find the solution in internet but I got the same solution what I did as I mentioned in my above steps

User · Answer

The same thing happened to me recently  you just need to download the  punkt  package and it should work   When you execute  list   l  after having  downloaded all the available things   is everything marked like the following line          punkt                Punkt Tokenizer Models   If you see this line with the star  it means you have it  and nltk should be able to load it

User · Answer

Go to python console by typing        python   in your terminal  Then  type the following 2 commands in your python shell to install the respective packages         nltk download  punkt        nltk download  averaged perceptron tagger     This solved the issue for me

User · Answer

For me it got solved by using  quot nltk  quot  http   www nltk org howto data html Failed loading english pickle with nltk data load sent tokenizer nltk data load  nltk tokenizers punkt english pickle

User · Answer

I faced same issue  After downloading everything  still  punkt  error was there  I searched package on my windows machine at C  Users vaibhav AppData Roaming nltk data tokenizers and I can see  punkt zip  present there  I realized that somehow the zip has not been extracted into C  Users vaibhav AppData Roaming nltk data tokenizers punk  Once I extracted the zip  it worked like music

User · Answer

To add to alvas  answer  you can download only the punkt corpus   nltk download  punkt     Downloading all sounds like overkill to me  Unless that s what you want

User · Answer

import nltk nltk download  punkt     Open the Python prompt and run the above statements   The sent tokenize function uses an instance of PunktSentenceTokenizer from the nltk tokenize punkt module  This instance has already been trained and works well for many European languages  So it knows what punctuation and characters mark the end of a sentence and the beginning of a new sentence

User · Answer

My issue was that I called nltk download  all   as the root user  but the process that eventually used nltk was another user who didn t have access to  root nltk data where the content was downloaded     So I simply recursively copied everything from the download location to one of the paths where NLTK was looking to find it like this   cp -R  root nltk data   home ubuntu nltk data

User · Answer

For me nothing of the above worked  so I just downloaded all the files by hand from the web site http   www nltk org nltk data  and I put them also by hand in a file  tokenizers  inside of  nltk data  folder  Not a pretty solution but still a solution

User · Answer

If you re looking to only download the punkt model   import nltk nltk download  punkt     If you re unsure which data model you need  you can install the popular datasets  models and taggers from NLTK   import nltk nltk download  popular     With the above command  there is no need to use the GUI to download the datasets

User · Answer

From the shell you can execute   sudo python -m nltk downloader punkt    If you want to install the popular NLTK corpora models   sudo python -m nltk downloader popular   If you want to install all NLTK corpora models   sudo python -m nltk downloader all   To list the resources you have downloaded   python -c  import os  import nltk  print os listdir nltk data find  corpora     python -c  import os  import nltk  print os listdir nltk data find  tokenizers

User · Answer

Just make sure you are using Jupyter Notebook and in a notebook  do the following   import nltk  nltk download     Then one popup window will appear  showing info https   raw githubusercontent com nltk nltk data gh-pages index xml   From that you have to download everything   Then rerun your code

User · Answer

You need to rearrange your folders Move your tokenizers folder into nltk data folder  This doesn t work if you have nltk data folder containing corpora folder containing  tokenizers folder

User · Answer

I was getting an error despite importing the following    import nltk nltk download     but for google colab this solved my issue        python3 -c  import nltk  nltk download  all

User · Answer

Execute the following code   import nltk nltk download    After this  NLTK downloader will pop out  Select All packages  Download punkt

User · Answer

Add the following lines into your script  This will automatically download the punkt data  import nltk nltk download  punkt

User · Answer

After adding this line of code  the issue will be fixed   nltk download  punkt

User · Answer

Simple nltk download   will not solve this issue  I tried the below and it worked for me   in the nltk folder create a tokenizers folder and copy your punkt folder into tokenizers folder   This will work   the folder structure needs to be as shown in the picture

User · Answer

I got the solution   import nltk nltk download     once the NLTK Downloader starts      d  Download   l  List    u  Update   c  Config   h  Help   q  Quit  Downloader  d  Download which package  l list  x cancel     Identifier  punkt

[python] Resource u'tokenizers/punkt/english.pickle' not found

Examples related to python

Examples related to unix

Examples related to nltk