Match punctuation sign or end of a line

I want to improve the NLTK sentence tokenizer. Unfortunately, it often fails to split sentences when the text has no whitespace between a period and the start of the next sentence (for example, "This is one.This is two.").
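One possible approach, sketched below with only the standard `re` module, is to insert the missing space before tokenizing, and to treat a sentence as ending at either a punctuation sign or the end of the text. The function names `add_missing_spaces` and `split_sentences` are hypothetical, and the heuristic (a lowercase letter or digit, then `.`, `!` or `?`, then an uppercase letter) is an assumption that will misfire on inputs such as "e.g.Something", so treat it as a starting point rather than a complete fix.

```python
import re

# Assumption: a boundary with a missing space looks like
# "<lowercase or digit><. ! or ?><Uppercase>", e.g. "one.This".
# Both lookarounds are zero-width, so sub() only inserts a space.
_MISSING_SPACE = re.compile(r'(?<=[a-z0-9][.!?])(?=[A-Z])')

# A sentence is a run of non-terminator characters followed by
# either one or more punctuation signs or the end of the text.
_SENTENCE = re.compile(r'[^.!?]+(?:[.!?]+|$)')


def add_missing_spaces(text):
    """Insert a space where a sentence ends and the next starts with no gap."""
    return _MISSING_SPACE.sub(' ', text)


def split_sentences(text):
    """Naive splitter: normalize spacing, then cut at punctuation or end of text."""
    normalized = add_missing_spaces(text)
    return [s.strip() for s in _SENTENCE.findall(normalized) if s.strip()]


if __name__ == '__main__':
    print(add_missing_spaces('This is one.This is two.'))
    print(split_sentences('First sentence.Second one!Third without end'))
```

If the goal is only to help NLTK, the output of `add_missing_spaces` could be passed to `nltk.sent_tokenize` instead of using the naive `split_sentences`, which ignores abbreviations and decimal numbers.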