How do I tokenize a string sentence in NLTK?

Question

I am using nltk, and I want to create my own custom texts just like the default ones in nltk.book. However, I've only got as far as building the list by hand:

    my_text = ['This', 'is', 'my', 'text']

I'd like to find a way to input my text as a plain string instead:

    my_text = "This is my text, this is a nice way to input text."

Which method, Python's or one from nltk, allows me to do this? And more importantly, how can I discard punctuation symbols?

User · Accepted Answer

This is actually on the main page of nltk.org:

    >>> import nltk
    >>> sentence = """At eight o'clock on Thursday morning
    ... Arthur didn't feel very good."""
    >>> tokens = nltk.word_tokenize(sentence)
    >>> tokens
    ['At', 'eight', "o'clock", 'on', 'Thursday', 'morning',
    'Arthur', 'did', "n't", 'feel', 'very', 'good', '.']
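The question also asks how to discard punctuation, and word_tokenize keeps punctuation marks as separate tokens (the final '.' in the output of this answer). One way to handle that is to filter the token list afterwards. This is a minimal sketch, not part of the nltk API: the helper name drop_punctuation is hypothetical, and the token list is copied from the output in this answer so the snippet runs without nltk installed.

```python
import string

def drop_punctuation(tokens):
    """Keep only tokens that contain at least one non-punctuation character."""
    return [tok for tok in tokens
            if not all(ch in string.punctuation for ch in tok)]

# Token list copied from the nltk.word_tokenize output in this answer.
tokens = ['At', 'eight', "o'clock", 'on', 'Thursday', 'morning',
          'Arthur', 'did', "n't", 'feel', 'very', 'good', '.']

words = drop_punctuation(tokens)
print(words)
```

Tokens such as "o'clock" and "n't" survive because they contain letters; only tokens made up entirely of punctuation characters are removed.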

User · Answer

As @PavelAnossov answered, the canonical answer is to use the word_tokenize function in nltk:

    from nltk import word_tokenize
    sent = "This is my text, this is a nice way to input text."
    word_tokenize(sent)

If your sentence is truly simple enough, you can skip nltk: use the string.punctuation set to remove punctuation, then split on the whitespace delimiter:

    import string
    x = "This is my text, this is a nice way to input text."
    # Drop every punctuation character, then split on single spaces.
    y = "".join([i for i in x if i not in string.punctuation]).split(" ")
    print(y)
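A variant of the plain-Python approach, swapping the character-by-character join for str.translate and splitting on any run of whitespace, might look like the sketch below. It assumes Python 3 (str.maketrans), and the names simple_tokenize and _STRIP_PUNCT are made up for illustration.

```python
import string

# Translation table that deletes every character in string.punctuation.
_STRIP_PUNCT = str.maketrans("", "", string.punctuation)

def simple_tokenize(text):
    """Remove punctuation characters, then split on runs of whitespace."""
    return text.translate(_STRIP_PUNCT).split()

sent = "This is my text, this is a nice way to input text."
print(simple_tokenize(sent))
```

Note that this also strips apostrophes, so "didn't" becomes "didnt". For text with contractions or hyphenated words, word_tokenize is likely the better fit.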
