
Extract Non-content English Language Words String - Python

I am working on a Python script in which I want to remove common English words like 'the', 'an', 'and', 'for' and many more from a string. Currently what I have done is I have made

Solution 1:

This will also work:

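yourString = "an elevator is made for five people and it's fast"
# Each entry ends with a space so the gap after the removed word goes too
wordsToRemove = ["the ", "an ", "and ", "for "]

# str.replace removes every occurrence of the given substring
for word in wordsToRemove:
    yourString = yourString.replace(word, "")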
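
Because str.replace matches substrings, an entry like "an " would also be stripped from the end of "man ". Here is a minimal whole-word sketch using the standard re module. The function name remove_words is made up for illustration and is not part of the original answer.

```python
import re

def remove_words(text, words_to_remove):
    # Hypothetical helper: removes whole words only, ignoring case.
    if not words_to_remove:
        return text
    # \b matches at word boundaries, so "an" is not matched inside "man" or "and".
    alternatives = "|".join(re.escape(w) for w in words_to_remove)
    pattern = r"\b(?:" + alternatives + r")\b\s*"
    return re.sub(pattern, "", text, flags=re.IGNORECASE).strip()

print(remove_words("an elevator is made for five people and it's fast",
                   ["the", "an", "and", "for"]))
```

Note that the words are listed without trailing spaces here, since the pattern takes care of the whitespace after each removed word.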

Solution 2:

I have found that what I was looking for is this:

from nltk.corpus import stopwords
my_stop_words = stopwords.words('english')

Now I can remove or replace any word in my list or string that matches an entry in my_stop_words, which is a list.
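
As a rough sketch of that matching step, the snippet below splits a string on whitespace and keeps only the tokens that are not in a stop list. The names remove_stop_words and sample_stop_words are assumptions for illustration. The short hand-written list stands in for my_stop_words so the snippet runs without the NLTK data.

```python
def remove_stop_words(text, stop_words):
    # Hypothetical helper: a set makes each membership check fast.
    stop_set = {w.lower() for w in stop_words}
    # Keep tokens whose lowercase form is not a stop word.
    kept = [tok for tok in text.split() if tok.lower() not in stop_set]
    return " ".join(kept)

# Assumed stand-in for my_stop_words = stopwords.words('english')
sample_stop_words = ["the", "an", "and", "for", "is"]

print(remove_stop_words("An elevator is made for five people and it's fast",
                        sample_stop_words))
```

Splitting on whitespace leaves punctuation attached to tokens, so "the," would not match "the"; a tokenizer would be needed for text with punctuation.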

For this to work I had to install NLTK for Python and then use its downloader to download the stopwords package.
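
The download step can also be done from code. This is a sketch only, assuming NLTK is already installed (for example with pip) and that the machine has network access. The function name load_english_stop_words is made up for illustration.

```python
def load_english_stop_words():
    # Imported inside the function so that merely defining it
    # does not require NLTK to be installed.
    import nltk
    from nltk.corpus import stopwords
    try:
        return stopwords.words("english")
    except LookupError:
        # The corpus is not on disk yet: fetch it once, then retry.
        nltk.download("stopwords")
        return stopwords.words("english")
```

Calling load_english_stop_words() the first time triggers the download; later calls read the corpus from the local NLTK data directory.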

The downloader also offers many other packages that are useful for NLP in different situations, such as words, brown, wordnet, etc.
