I am working on a program that should filter out certain words, in the style of NLTK's stopword removal, as follows:
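    import re

    def phrasefilter(phrase):
        # Normalise greetings (note: str.replace also matches inside longer words)
        phrase = phrase.replace('hi', 'hello')
        phrase = phrase.replace('hey', 'hello')
        # Lowercase, then strip everything except letters, digits and whitespace
        phrase = re.sub(r'[^A-Za-z0-9\s]+', '', phrase.lower())
        noise_words_set = {'of', 'the', 'at', 'for', 'in', 'and', 'is', 'from', 'are', 'our', 'it', 'its', 'was', 'when', 'how', 'what', 'like', 'whats', 'now', 'panic', 'very'}
        # Keep only the words that are not in the noise set
        return ' '.join(w for w in phrase.split() if w.lower() not in noise_words_set)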
Is there a way of doing this in the web2py DAL?
    db.define_table( words,
        Field(words1, REQUIRES IS_NOT_NULL(), REQUIRES....
I want to put it in the REQUIRES constraints, for example as IS_NOT_IN_NOISE_WORDS_SET(). Is this possible? I am working on user input (strings saved to the db), and I would like the stopwords I have chosen to be removed automatically, instead of using the snippet shown above.
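To make the idea concrete, here is a minimal sketch of what such a constraint might look like. It assumes web2py's convention that a validator is a callable taking the submitted value and returning a `(value, error)` tuple, where the returned value may be a transformed version of the input. The class name `REMOVE_NOISE_WORDS`, its error message, and the choice to reject input that consists only of noise words are hypothetical and not part of web2py. The class is plain Python, so it can be tried outside web2py.

```python
import re

# Same noise words as in phrasefilter above.
NOISE_WORDS = frozenset([
    'of', 'the', 'at', 'for', 'in', 'and', 'is', 'from', 'are', 'our',
    'it', 'its', 'was', 'when', 'how', 'what', 'like', 'whats', 'now',
    'panic', 'very',
])


class REMOVE_NOISE_WORDS(object):
    """Hypothetical validator-style filter (name is illustrative only).

    Follows the (value, error) return convention: error is None on success.
    """

    def __init__(self, noise_words=NOISE_WORDS,
                 error_message='Enter at least one meaningful word'):
        self.noise_words = noise_words
        self.error_message = error_message

    def __call__(self, value):
        # Lowercase, then drop everything except letters, digits, whitespace.
        cleaned = re.sub(r'[^A-Za-z0-9\s]+', '', str(value).lower())
        kept = [w for w in cleaned.split() if w not in self.noise_words]
        if not kept:
            # Nothing left after filtering: report an error, keep the input.
            return (value, self.error_message)
        # Success: hand back the filtered string so that is what gets stored.
        return (' '.join(kept), None)


# Assumed usage inside a web2py model (not runnable outside web2py):
#   db.define_table('words', Field('words1', requires=REMOVE_NOISE_WORDS()))
```

Validators in `requires` are applied when a form is processed, not on direct `db.words.insert(...)` calls. If the filtering should also apply to direct inserts, the DAL's `filter_in` attribute on a field (for example `db.words.words1.filter_in = phrasefilter`) may be a better fit.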
Copyright notice: Content author: lobjc. Reproduced under the CC BY-SA 4.0 license with a link to the original source and this disclaimer.
Link to original article: https://stackoverflow.com/questions/30772123/how-to-filter-words-in-db-body