
I am parsing Java source code using Python. I need to extract the comment text from the source. I have tried the following.

Take 1:

cmts = re.findall(r'/\*\*(.|[\r\n])*?\*/', lines)

Returns: blanks [' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ']

Take 2 (added a capturing group around the whole regex):

cmts = re.findall(r'(/\*\*(.|[\r\n])*?\*/)', lines)

Returns

Single line comment (example only):

('/**\n\n * initialise the tag with the colors and the tag name\n\n */', ' ')

Multi line comment (example only):

('/**\n\n * Get the color related to a specified tag\n\n * @param tag the tag that we want to get the colour for\n\n * @return color of the tag in String\n\n */', ' ')

I am only interested in the text itself, i.e. "initialise the tag with the colors and the tag name" for the first example, or "Get the color related to a specified tag, @param tag the tag that we want to get the colour for, @return color of the tag in String" for the second, and I can't get my head around how to extract it. Please give me some pointers!

1 Answer


To extract comments (everything between /** and */), you can use:

re.findall(r'/\*\*(.*?)\*/', text, re.S)

(Note that the capture group can be simplified when re.S/re.DOTALL is used, because the dot then also matches newlines.)
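As a rough illustration of why the attempts in the question returned blanks: when a pattern contains a capture group, re.findall returns the group contents rather than the whole match, and a repeated group keeps only its last iteration. The sketch below compares the question's pattern with a re.S version; the sample string and the variable names are made up for the example.

```python
import re

# Hypothetical sample input, not taken from the asker's actual source file.
sample = "/**\n * initialise the tag\n */\nclass Tag {}"

# Pattern from the question: the repeated group (.|[\r\n]) keeps only its
# last iteration, so findall returns a single character per comment.
take1 = re.findall(r'/\*\*(.|[\r\n])*?\*/', sample)

# With re.S the dot also matches newlines, so one lazy group holds the body.
bodies = re.findall(r'/\*\*(.*?)\*/', sample, re.S)

print(take1)   # [' ']
print(bodies)  # ['\n * initialise the tag\n ']
```

The single space in the first result is the character just before the closing */, which is consistent with the list of blanks reported in the question.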

Then, for each match, you can collapse runs of spaces and * into a single space, strip the ends, and replace each run of \n with a comma:

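import re

def comments(text):
    for comment in re.findall(r'/\*\*(.*?)\*/', text, re.S):
        # Collapse runs of spaces and '*' to one space, trim, then turn newline runs into commas.
        yield re.sub('\n+', ',', re.sub(r'[ *]+', ' ', comment).strip())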

For example:

>>> list(comments('/**\n\n     * Get the color related to a specified tag\n\n     * @param tag the tag that we want to get the colour for\n\n     * @return color of the tag in String\n\n     */'))
['Get the color related to a specified tag, @param tag the tag that we want to get the colour for, @return color of the tag in String']
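If a list of lines per comment is more convenient than one comma-joined string, a variation along the following lines might work. This is only a sketch: the name comment_lines is hypothetical, and it assumes every comment line starts with optional whitespace followed by a run of asterisks. Unlike the version above, it leaves asterisks inside the comment text alone.

```python
import re

def comment_lines(text):
    """Yield, for each /** ... */ block, a list of its non-empty lines."""
    for body in re.findall(r'/\*\*(.*?)\*/', text, re.S):
        lines = []
        for raw in body.splitlines():
            # Remove only the leading whitespace and '*' decoration,
            # so asterisks inside the comment text are preserved.
            cleaned = re.sub(r'^\s*\**\s?', '', raw).rstrip()
            if cleaned:
                lines.append(cleaned)
        yield lines

src = ('/**\n\n * Get the color related to a specified tag\n\n'
       ' * @return color of the tag in String\n\n */')
result = list(comment_lines(src))
print(result)
```

Each inner list can then be joined with ', ' or processed line by line, for example to separate the @param and @return tags from the description.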