How to filter words in db.body
NickName:lobjc Ask DateTime:2015-06-11T13:17:38


I am working on a program in which I want to filter out some words, in the nltk style of removing stopwords, as follows:

import re

def phrasefilter(phrase):
    # Normalise greetings, then lowercase and strip punctuation
    phrase = phrase.replace('hi', 'hello')
    phrase = phrase.replace('hey', 'hello')
    phrase = re.sub(r'[^A-Za-z0-9\s]+', '', phrase.lower())
    noise_words_set = {'of', 'the', 'at', 'for', 'in', 'and', 'is', 'from', 'are', 'our', 'it', 'its', 'was', 'when', 'how', 'what', 'like', 'whats', 'now', 'panic', 'very'}
    # Keep only the words that are not in the noise set
    return ' '.join(w for w in phrase.split() if w.lower() not in noise_words_set)

Is there a way to do this in the web2py DAL?

db.define_table( words,
    Field(words1, REQUIRES  IS_NOT_NULL(), REQUIRES....

For example, I would like to put it in the requires constraints as something like IS_NOT_IN_NOISE_WORDS_SET(). Is this possible? I am working with user input (strings that are saved to the db), and I want the stopwords I have chosen to be deleted automatically, instead of having to use the snippet shown above.
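To make the idea concrete, here is a rough sketch of what such a constraint might look like. It relies on my understanding that web2py accepts any callable returning a (value, error) tuple as a validator, and that a validator may transform the value it returns. The class name REMOVE_NOISE_WORDS and the variable strip_noise are made up for illustration; they are not part of web2py.

```python
class REMOVE_NOISE_WORDS(object):
    """Hypothetical validator-style filter (not a built-in web2py validator).

    It never rejects input; it only returns the value with noise words removed.
    """

    def __init__(self, noise_words):
        # Store lowercased words in a set for fast membership tests
        self.noise_words = set(w.lower() for w in noise_words)

    def __call__(self, value):
        words = str(value).split()
        kept = [w for w in words if w.lower() not in self.noise_words]
        # (transformed value, error); None means "no error"
        return (' '.join(kept), None)


NOISE_WORDS = ['of', 'the', 'at', 'for', 'in', 'and', 'is', 'from', 'are',
               'our', 'it', 'its', 'was', 'when', 'how', 'what', 'like',
               'whats', 'now', 'panic', 'very']
strip_noise = REMOVE_NOISE_WORDS(NOISE_WORDS)

# Assumed usage inside a web2py model file (not runnable outside web2py):
# db.define_table('words',
#     Field('words1', requires=[strip_noise, IS_NOT_EMPTY()]))
```

Calling strip_noise('What is the price of the book') directly gives ('price book', None). Putting the filter before IS_NOT_EMPTY() in the list is meant to reject input that consists only of noise words, assuming validators in a list are applied in order with the transformed value passed along.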

Copyright Notice:Content Author:「lobjc」,Reproduced under the CC 4.0 BY-SA copyright license with a link to the original source and this disclaimer.
Link to original article:https://stackoverflow.com/questions/30772123/how-to-filter-words-in-db-body
