I have a string containing words, and each word has its own token (e.g. NN, NNP, JJ, etc.). I want to extract the words that carry the NNP token, keeping consecutive (repeated) NNP words together. My code so far:
import re
sentence = "Rapunzel/NNP Sheila/NNP let/VBD down/RP her/PP$ long/JJ golden/JJ hair/NN in Yasir/NNP"
tes = re.findall(r'(\w+)/NNP', sentence)
print(tes)
The result of the code:
['Rapunzel', 'Sheila', 'Yasir']
As we can see, three words carry the NNP token: Rapunzel/NNP and Sheila/NNP (which appear next to each other) and Yasir/NNP (separated from the other NNP words by several words). My problem is that I need to separate the run of consecutive NNP words from the others. My expected result looks like this:
['Rapunzel/NNP', 'Sheila/NNP'], ['Yasir/NNP']
What is the best way to perform this task? Thanks.
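One possible way to get the grouping described above is to split the sentence on whitespace and collect runs of adjacent NNP tokens with `itertools.groupby`. This is only a minimal sketch: it assumes tokens are separated by whitespace and that a word counts as NNP only when it ends with exactly `/NNP`. The helper name `group_nnp_runs` is made up for illustration.

```python
from itertools import groupby

sentence = "Rapunzel/NNP Sheila/NNP let/VBD down/RP her/PP$ long/JJ golden/JJ hair/NN in Yasir/NNP"

def group_nnp_runs(text):
    """Return lists of adjacent tokens that end with '/NNP' (hypothetical helper)."""
    tokens = text.split()  # assumption: tokens are whitespace-separated
    # groupby yields one group per run of consecutive tokens sharing the same key
    runs = groupby(tokens, key=lambda tok: tok.endswith("/NNP"))
    # keep only the runs whose key is True, i.e. the NNP runs
    return [list(run) for is_nnp, run in runs if is_nnp]

groups = group_nnp_runs(sentence)
print(groups)  # [['Rapunzel/NNP', 'Sheila/NNP'], ['Yasir/NNP']]
```

If only the bare words are wanted, each token could be cut at the slash afterwards, for example with `tok.rsplit("/", 1)[0]`.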
['Rapunzel/NNP', 'Sheila/NNP'], ['Yasir/NNP'] and not ['Rapunzel', 'Sheila'], ['Yasir']? You set a capturing group in your pattern around \w+ - is it a "typo"?
\w+ is not a typo; I guess it is meant to detect any letters before /NNP. Correct me if I am wrong. Thanks.
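To illustrate the point about the capturing group: when a pattern contains exactly one group, `re.findall` returns only the text matched by that group, which is why the `/NNP` suffix is missing from the original output. The sketch below compares both variants. The trailing `\b` is an addition that is not in the original pattern, included on the assumption that tags such as `/NNPS` should not be matched.

```python
import re

sentence = "Rapunzel/NNP Sheila/NNP let/VBD down/RP her/PP$ long/JJ golden/JJ hair/NN in Yasir/NNP"

# With a capturing group, findall returns only the captured part (the word).
captured = re.findall(r'(\w+)/NNP\b', sentence)

# Without a group, findall returns the whole match, including the tag.
whole = re.findall(r'\w+/NNP\b', sentence)

print(captured)  # ['Rapunzel', 'Sheila', 'Yasir']
print(whole)     # ['Rapunzel/NNP', 'Sheila/NNP', 'Yasir/NNP']
```

Neither variant groups adjacent NNP words by itself; both only change what each match looks like.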