I've tried multiple times, in several different ways, to remove the extra punctuation from the string.
import string

class NLP:
    def __init__(self, sentence):
        self.sentence = sentence.lower()
        self.tokenList = []

    # problem: the punctuation is still included in the word
    def tokenize(self, sentence):
        for word in sentence.split():
            self.tokenList.append(word)
            for i in string.punctuation:
                if i in word:
                    word.strip(i)
                    self.tokenList.append(i)
Quick explanation of the code: it is supposed to split the sentence into words and punctuation marks and store each of them as a separate item in a list. But when a punctuation mark is next to a word, it stays attached to the word. Below is an example where a comma remains grouped with the word 'hello':
['hello,' , ',' , 'my' , 'name' , 'is' , 'freddy']
#      ^
# there's the problem
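For what it's worth, two things in the loop above look like the likely cause. First, the word is appended to the list before any punctuation is removed. Second, Python strings are immutable, so `word.strip(i)` returns a new string and leaves `word` unchanged; the result is thrown away. Below is a minimal sketch of one way around both, assuming the goal is exactly the list described (each word and each punctuation mark as its own item). It walks each word character by character instead of using `strip`, which is a swapped-in approach and not necessarily what was originally intended.

```python
import string


class NLP:
    def __init__(self, sentence):
        self.sentence = sentence.lower()
        self.tokenList = []

    def tokenize(self, sentence):
        for word in sentence.split():
            current = ""  # letters collected so far for the current token
            for ch in word:
                if ch in string.punctuation:
                    # Flush the pending word (if any), then store the
                    # punctuation mark as its own token.
                    if current:
                        self.tokenList.append(current)
                        current = ""
                    self.tokenList.append(ch)
                else:
                    current += ch
            # Flush whatever is left after the last character.
            if current:
                self.tokenList.append(current)
        return self.tokenList


nlp = NLP("Hello, my name is Freddy")
tokens = nlp.tokenize(nlp.sentence)
print(tokens)  # ['hello', ',', 'my', 'name', 'is', 'freddy']
```

One caveat with this sketch: punctuation inside a word is split out too, so "don't" would become 'don', "'", 't'. If that matters, apostrophes would need special handling.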