Extract Non-content English Language Words String - Python
I am working on a Python script in which I want to remove common English words such as 'the', 'an', 'and', 'for', and many more from a string. Currently what I have done is I have made
Solution 1:
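This will also work:
yourString = "an elevator is made for five people and it's fast"
wordsToRemove = ["the ", "an ", "and ", "for "]
# Note: str.replace matches plain substrings, so "an " would also be
# stripped from the end of a word such as "man ".
for word in wordsToRemove:
    yourString = yourString.replace(word, "")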
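Because str.replace works on substrings, the loop above can also cut into longer words that happen to end in one of the listed words. A minimal sketch of a whole-word variant follows, assuming the standard re module; the helper name remove_words is made up for illustration and is not part of the original answer.

```python
import re

def remove_words(text, words_to_remove):
    # \b anchors each alternative at word boundaries, so "an" is not
    # removed from inside a longer word such as "man".
    pattern = r"\b(?:" + "|".join(re.escape(w) for w in words_to_remove) + r")\b"
    stripped = re.sub(pattern, "", text, flags=re.IGNORECASE)
    # Collapse the extra spaces left behind by the removed words.
    return " ".join(stripped.split())

your_string = "an elevator is made for five people and it's fast"
print(remove_words(your_string, ["the", "an", "and", "for"]))
```

Note that the words in the list no longer need a trailing space, since the boundaries are handled by the pattern.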
Solution 2:
I have found that what I was looking for is this:
from nltk.corpus import stopwords
my_stop_words = stopwords.words('english')
Now I can remove or replace the words in my list or string wherever I find a match in my_stop_words, which is a list.
For this to work I had to install NLTK for Python and then use its downloader, nltk.download('stopwords'), to download the stopwords package.
The downloader also offers many other packages that are useful in different NLP situations, such as words, brown, and wordnet.
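The matching step described in this answer can be sketched roughly as follows. To keep the example runnable without the NLTK download, my_stop_words is a small stand-in list here (an assumption); with NLTK set up it would be stopwords.words('english') instead. The function name remove_stop_words is hypothetical.

```python
import string

# Stand-in for stopwords.words('english'); assumed here so the sketch
# runs without NLTK installed.
my_stop_words = ["the", "an", "and", "for"]

def remove_stop_words(text, stop_words):
    # A set makes each membership check fast.
    stop_set = {w.lower() for w in stop_words}
    kept = []
    for token in text.split():
        # Compare case-insensitively and ignore surrounding punctuation.
        if token.strip(string.punctuation).lower() not in stop_set:
            kept.append(token)
    return " ".join(kept)

print(remove_stop_words("An elevator is made for five people, and it's fast", my_stop_words))
```

Splitting on whitespace is a simplification; a proper tokenizer may be preferable for real text.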