Updated answer: NLTK works well with Python 2.7. I had Python 3.2, so I uninstalled 3.2 and installed 2.7. Now it works!
I have installed NLTK and tried to download NLTK Data. I followed the instructions on this site: http://www.nltk.org/data.html
I downloaded NLTK, installed it, and then tried to run the following code:
>>> import nltk
>>> nltk.download()
It gave me the following error message:
Traceback (most recent call last):
File "<pyshell#6>", line 1, in <module>
nltk.download()
AttributeError: 'module' object has no attribute 'download'
Directory of C:\Python32\Lib\site-packages
I tried both nltk.download() and nltk.downloader(); both gave me error messages.
Then I used help(nltk) to inspect the package. It shows the following info:
NAME
nltk
PACKAGE CONTENTS
align
app (package)
book
ccg (package)
chat (package)
chunk (package)
classify (package)
cluster (package)
collocations
corpus (package)
data
decorators
downloader
draw (package)
examples (package)
featstruct
grammar
help
inference (package)
internals
lazyimport
metrics (package)
misc (package)
model (package)
parse (package)
probability
sem (package)
sourcedstring
stem (package)
tag (package)
test (package)
text
tokenize (package)
toolbox
tree
treetransforms
util
yamltags
FILE
c:\python32\lib\site-packages\nltk
I do see downloader there, so I am not sure why it does not work. I am using Python 3.2.2 on Windows Vista.
I think you must have named your own file nltk.py (or the folder you are working in contains a file with that name). Python then imports that file instead of the real NLTK package, so rename it to anything else and try running it again.
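One way to check for this kind of name clash is to ask Python which file it would load for a given module name. The following is only a minimal sketch using the standard library; the helper names module_origin and is_shadowed_by are made up for illustration and are not part of NLTK.

```python
import importlib.util
import os


def module_origin(name):
    """Return the file Python would load for `name`, or None if it cannot be found."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


def is_shadowed_by(name, directory):
    """Return True if `name` resolves to a file or package directly inside `directory`."""
    origin = module_origin(name)
    if origin is None or not os.path.isabs(origin):
        return False  # not found, or a built-in/frozen module with no real file
    parent = os.path.dirname(os.path.realpath(origin))
    if os.path.basename(origin) == "__init__.py":
        parent = os.path.dirname(parent)  # step out of the package folder
    return parent == os.path.realpath(directory)


if __name__ == "__main__":
    # A healthy install normally points into site-packages, not your project folder.
    print("nltk resolves to:", module_origin("nltk"))
    print("shadowed by current directory:", is_shadowed_by("nltk", os.getcwd()))
```

If the first line prints a path ending in nltk.py inside your own working folder, that file is most likely what is hiding the real package.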
To download a particular dataset or model, use the nltk.download() function. For example, if you are looking to download the punkt sentence tokenizer, use:
$ python3
>>> import nltk
>>> nltk.download('punkt')
If you're unsure of which data/model you need, you can start out with the basic list of data + models with:
>>> import nltk
>>> nltk.download('popular')
This downloads a list of "popular" resources, which includes:
<collection id="popular" name="Popular packages">
<item ref="cmudict" />
<item ref="gazetteers" />
<item ref="genesis" />
<item ref="gutenberg" />
<item ref="inaugural" />
<item ref="movie_reviews" />
<item ref="names" />
<item ref="shakespeare" />
<item ref="stopwords" />
<item ref="treebank" />
<item ref="twitter_samples" />
<item ref="omw" />
<item ref="wordnet" />
<item ref="wordnet_ic" />
<item ref="words" />
<item ref="maxent_ne_chunker" />
<item ref="punkt" />
<item ref="snowball_data" />
<item ref="averaged_perceptron_tagger" />
</collection>
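Each ref in the collection is an identifier that can also be passed to nltk.download() on its own, as with 'punkt' earlier. As a small sketch, the identifiers can be pulled out of such a collection with the standard library; the XML below is a shortened copy of the listing above, and collection_items is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the "popular" collection shown above.
POPULAR_XML = """
<collection id="popular" name="Popular packages">
  <item ref="cmudict" />
  <item ref="punkt" />
  <item ref="stopwords" />
  <item ref="wordnet" />
</collection>
"""


def collection_items(xml_text):
    """Return the `ref` identifiers listed in a collection element."""
    root = ET.fromstring(xml_text.strip())
    return [item.get("ref") for item in root.iter("item")]


if __name__ == "__main__":
    for identifier in collection_items(POPULAR_XML):
        print(identifier)
```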
In case anyone is trying to avoid errors when downloading larger datasets from nltk, here is a workaround from https://stackoverflow.com/a/38135306/610569
$ rm /Users/<your_username>/nltk_data/corpora/panlex_lite.zip
$ rm -r /Users/<your_username>/nltk_data/corpora/panlex_lite
$ python
>>> import nltk
>>> dler = nltk.downloader.Downloader()
>>> dler._update_index()
>>> dler._status_cache['panlex_lite'] = 'installed' # Trick the index into treating panlex_lite as if it's already installed.
>>> dler.download('popular')
From v3.2.5, NLTK has a more informative error message when an nltk_data resource is not found, e.g.:
>>> from nltk import word_tokenize
>>> word_tokenize('x')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Users/l/alvas/git/nltk/nltk/tokenize/__init__.py", line 128, in word_tokenize
sentences = [text] if preserve_line else sent_tokenize(text, language)
File "/Users//alvas/git/nltk/nltk/tokenize/__init__.py", line 94, in sent_tokenize
tokenizer = load('tokenizers/punkt/{0}.pickle'.format(language))
File "/Users/alvas/git/nltk/nltk/data.py", line 820, in load
opened_resource = _open(resource_url)
File "/Users/alvas/git/nltk/nltk/data.py", line 938, in _open
return find(path_, path + ['']).open()
File "/Users/alvas/git/nltk/nltk/data.py", line 659, in find
raise LookupError(resource_not_found)
LookupError:
**********************************************************************
Resource punkt not found.
Please use the NLTK Downloader to obtain the resource:
>>> import nltk
>>> nltk.download('punkt')
Searched in:
- '/Users/alvas/nltk_data'
- '/usr/share/nltk_data'
- '/usr/local/share/nltk_data'
- '/usr/lib/nltk_data'
- '/usr/local/lib/nltk_data'
- ''
**********************************************************************
To find the nltk_data directory (auto-magically), see https://stackoverflow.com/a/36383314/610569
To download nltk_data to a different path, see https://stackoverflow.com/a/48634212/610569
To configure the nltk_data path (i.e. set a different path for NLTK to find nltk_data), see https://stackoverflow.com/a/22987374/610569
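To see which of the searched locations actually exist on your machine, a quick check like the one below can help. It is a rough sketch that does not call NLTK at all: the directory list is copied from the "Searched in:" part of the error message above (so it is specific to that machine), the helper names are hypothetical, and it assumes NLTK also consults the NLTK_DATA environment variable for extra directories.

```python
import os

# Copied from the "Searched in:" list in the error message above.
# These are from that particular machine; yours will differ.
SEARCHED = [
    "/Users/alvas/nltk_data",
    "/usr/share/nltk_data",
    "/usr/local/share/nltk_data",
    "/usr/lib/nltk_data",
    "/usr/local/lib/nltk_data",
]


def candidate_dirs(defaults):
    """Put directories from NLTK_DATA (if set) in front of the default locations."""
    from_env = [p for p in os.environ.get("NLTK_DATA", "").split(os.pathsep) if p]
    return from_env + list(defaults)


def existing_dirs(candidates):
    """Keep only the candidates that are real directories on this machine."""
    return [path for path in candidates if path and os.path.isdir(path)]


if __name__ == "__main__":
    for path in existing_dirs(candidate_dirs(SEARCHED)):
        print("found:", path)
```

If nothing is printed, no nltk_data directory exists in any of those places yet, which matches the LookupError above.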
You may try:
>>> import nltk
>>> nltk.download_shell()
Downloader> d
Identifier> *name of the package*
happy nlp'ing.
You should add Python to your PATH during installation. After installation, open a command prompt and run: pip install nltk
Then go to IDLE, open a new file, save it as file.py, and open file.py.
Type the following:
import nltk
nltk.download()
Please try:
import nltk
nltk.download()
After running this, you get something like this:
NLTK Downloader
---------------------------------------------------------------------------
d) Download l) List u) Update c) Config h) Help q) Quit
---------------------------------------------------------------------------
Then press d and enter the following:
Downloader> d all
You will get the following message on completion. When the prompt returns, press q:
Done downloading collection all
I had a similar issue. Check whether you are behind a proxy.
If so, set up the proxy before downloading:
nltk.set_proxy('http://proxy.example.com:3128', ('USERNAME', 'PASSWORD'))
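If the username or password contains characters such as @ or :, putting them straight into a proxy URL can break it. The sketch below percent-encodes the credentials with the standard library before building the URL; build_proxy_url is a hypothetical helper, and the host, port and credentials are placeholders.

```python
from urllib.parse import quote


def build_proxy_url(host, port, username=None, password=None, scheme="http"):
    """Build a proxy URL, percent-encoding the credentials if any are given."""
    if username is None:
        return f"{scheme}://{host}:{port}"
    credentials = quote(username, safe="")
    if password is not None:
        credentials += ":" + quote(password, safe="")
    return f"{scheme}://{credentials}@{host}:{port}"


if __name__ == "__main__":
    # Placeholder values; replace them with your own proxy details.
    print(build_proxy_url("proxy.example.com", 3128))
    print(build_proxy_url("proxy.example.com", 3128, "USERNAME", "p@ss:word"))
```

The resulting string is the kind of value that would be passed as the first argument of nltk.set_proxy().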
Install pip: run in a terminal: sudo easy_install pip
Install NumPy (optional): run: sudo pip install -U numpy
Install NLTK: run: sudo pip install -U nltk
Test the installation: run: python
then type: import nltk
To download the corpus, run: python -m nltk.downloader all
If you are running a really old version of nltk, then there is indeed no download module available (reference)
Try this:
import nltk
print(nltk.__version__)
As per the reference, anything after 0.9.5 should be fine
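To turn that rule of thumb into a check, the version string can be compared numerically rather than as text. This is only a sketch: version_tuple and has_download_module are hypothetical helpers, and the 0.9.5 threshold is taken from the sentence above, not verified independently.

```python
import re


def version_tuple(version):
    """Turn a version string such as '3.2.5' into a tuple of ints, e.g. (3, 2, 5)."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first non-numeric component
        parts.append(int(match.group()))
    return tuple(parts)


def has_download_module(version, threshold="0.9.5"):
    """True if `version` is strictly newer than `threshold`."""
    return version_tuple(version) > version_tuple(threshold)


if __name__ == "__main__":
    for candidate in ("0.9.5", "2.0.4", "3.2.5"):
        print(candidate, has_download_module(candidate))
```

In practice you would pass in nltk.__version__ instead of the hard-coded strings.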
This worked for me:
nltk.set_proxy('http://user:[email protected]:8080')
nltk.download()
Try downloading the zip files from http://www.nltk.org/nltk_data/, then unzip them and save them in an nltk_data folder under your Python installation, such as C:\ProgramData\Anaconda3\nltk_data
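The unzip step can also be scripted. The sketch below assumes the layout seen elsewhere in this thread, where resources live in a category subfolder such as corpora or tokenizers inside nltk_data; install_data_zip is a hypothetical helper and the paths in the example are placeholders.

```python
import os
import zipfile


def install_data_zip(zip_path, data_dir, category):
    """Extract a manually downloaded zip into <data_dir>/<category>/ and return that folder."""
    target = os.path.join(data_dir, category)
    os.makedirs(target, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target)
    return target


if __name__ == "__main__":
    # Placeholder paths; point these at your own download and nltk_data folder.
    zip_file = "stopwords.zip"
    if os.path.exists(zip_file):
        print(install_data_zip(zip_file, os.path.expanduser("~/nltk_data"), "corpora"))
```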
You can't have a saved Python file called nltk.py,
because the interpreter imports that file instead of the actual NLTK package.
Rename the file that the Python shell is reading from and try what you were doing originally:
import nltk
and then nltk.download()
Try
nltk.download('all')
This will download all the data, so there is no need to download packages individually.
If you previously saved a file named nltk.py and have since renamed it to my_nltk_script.py, check whether a file named nltk.py still exists. If it does, delete any such files and run my_nltk_script.py. It should work!
It's very simple....
import nltk
nltk.download()
Do not name your file nltk.py. I used the same code and named my file nltk, and got the same error as you. I changed the file name and it went well.
Source: Stackoverflow.com