I want to extract all docstrings from my Python file using grep or awk. I tried:

cat test.py | grep """[\w\W]*?"""

But I see no output. Say test.py looks like this:

import libraries

class MyClass(object):
    """Docstring to this class. 
       second line of docstring."""

    def myClassMethod(a,b):
        """Docstring of the method. 
           another line in docstring of the method."""
        return a + b

Then the output should be everything enclosed in triple quotes:

"""Docstring to this class. 
second line of docstring."""
"""Docstring of the method. 
another line in docstring of the method."""
  • We're gonna need more details, like input sample and expected output. Commented Oct 27, 2017 at 10:00
  • I added the test case and fixed typos. Commented Oct 27, 2017 at 10:06

2 Answers


The proper way to extract docstrings from Python code is with an actual Python parser (the ast module):

#!/usr/bin/env python
import ast

with open('/path/to/file') as f:
    code = ast.parse(f.read())

for node in ast.walk(code):
    if isinstance(node, (ast.FunctionDef, ast.ClassDef, ast.Module)):
        docstring = ast.get_docstring(node)
        if docstring:
            print(repr(docstring))

Running it on your sample outputs:

'Docstring to this class. \nsecond line of docstring.'
'Docstring of the method. \nanother line in docstring of the method.'
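
If the goal is to reproduce the output format from the question (triple quotes kept, indentation stripped), a minimal sketch along the same lines might look like the following. The function name `extract_docstrings` and the inline sample are assumptions made for illustration and are not part of the answer above; `ast.AsyncFunctionDef` is added so that `async def` functions are covered as well (Python 3.5+).

```python
import ast

def extract_docstrings(source):
    """Return the cleaned docstrings found in `source`, in ast.walk order."""
    tree = ast.parse(source)
    documented = (ast.Module, ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)
    found = []
    for node in ast.walk(tree):
        if isinstance(node, documented):
            doc = ast.get_docstring(node)  # default clean=True strips indentation
            if doc:
                found.append(doc)
    return found

# Hypothetical inline sample mirroring test.py from the question.
sample = '''
class MyClass(object):
    """Docstring to this class.
       second line of docstring."""

    def myClassMethod(a, b):
        """Docstring of the method.
           another line in docstring of the method."""
        return a + b
'''

for doc in extract_docstrings(sample):
    # Re-add the triple quotes, since ast returns only the string contents.
    print('"""' + doc + '"""')
```

Passing `clean=False` to `ast.get_docstring` would keep the original indentation of continuation lines instead.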

Just for fun, we can also do it with GNU awk:

$ awk -v RS= -v FPAT="'''.*'''|"'""".*"""' '{print $1}' file
"""Docstring to this class. 
       second line of docstring."""
"""Docstring of the method. 
           another line in docstring of the method."""

3 Comments

I had found this method, but I was wondering if this could be done using grep or awk.
I wouldn't recommend it over the Python version, but take a look at the awk approach I just added.
Thanks a lot, this satisfied my curiosity.

With grep's -P option (Perl-compatible regular expressions) you can do the following:

grep -Poz '"""[^"]+"""' test.py

Output:

"""Docstring to this class. 
       second line of docstring.""""""Docstring of the method. 
           another line in docstring of the method."""
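
For comparison, the pattern from the question can be tried in Python, which may help show why the original grep attempt printed nothing: grep matches one line at a time unless told otherwise (hence `-z` above), and the shell removes the unescaped double quotes before grep ever sees the pattern. Applied to the whole file as a single string, the non-greedy pattern matches across newlines. This is only a sketch; the function name and sample text are assumptions for illustration, and like any regex approach it will also pick up triple-quoted strings that are not docstrings.

```python
import re

# Non-greedy match between triple double quotes. [\w\W] matches any
# character, newlines included, so re.DOTALL is not required.
TRIPLE_QUOTED = re.compile(r'"""[\w\W]*?"""')

def find_triple_quoted(text):
    """Return every triple-double-quoted span in `text`, quotes included."""
    return TRIPLE_QUOTED.findall(text)

# Hypothetical inline sample mirroring test.py from the question.
sample = '''
class MyClass(object):
    """Docstring to this class.
       second line of docstring."""

    def myClassMethod(a, b):
        """Docstring of the method.
           another line in docstring of the method."""
        return a + b
'''

for match in find_triple_quoted(sample):
    print(match)
```

Unlike the ast approach, the matches keep the original indentation of the continuation lines.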

2 Comments

I am getting an error: grep: unescaped ^ or $ not supported with -Pz
Mine is grep (GNU grep) 2.24. Does it work if we use something other than Perl regex?
