Using the Python below, I am trying to find specific headings within a Markdown file:

import re

source_file = open('readme.md')

#Defines variables containing the headings as Regex searches

product_name = re.compile('(?mi)^#+[^\S\n]product.*name')
product_version = re.compile('(?mi)^#+[^\S\n]product.*version')
product_description = re.compile('(?mi)^#+[^\S\n]product.*description')

#Adds the required headings to a list
headings = [product_name, product_version, product_description]

#Opens the readme, searches for the specified headings, then uses print_section to print the content under the headings
with open('readme.md') as file:
    for x in headings:
        content = x.search(file.read()).group(0)
        print(content)

The readme.md file contains the following placeholder text:

# Product Name
Test Product
## Product Version
0.01
## Product Description
A short overview of what this product is; what it does; and what its intended use case is
# Specifications
## Networking
the quick brown fox jumped over the lazy dog
## Data Usage
This is line one of the sample
This is line two of the sample

The response I get from running this file is:

# Product Name
Traceback (most recent call last):
  File "sample.py", line 16, in <module>
    content = x.search(file.read()).group(0)
AttributeError: 'NoneType' object has no attribute 'group'

So I think there is an issue with the second and third regex patterns. However, in regex testers they appear to match properly. I can't spot any difference between these patterns and the first, which succeeds. I've even tried swapping the order of the content in the Markdown file, and still only product_name matches.

  • You really should be using raw strings (r'...' or r"...") for the regular expressions. I don't think it's actually causing a problem in this case, but regular string literals may interpret backslashes in a different way than the regex engine does. Commented May 20, 2020 at 12:19
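To illustrate the point about raw strings, here is a minimal sketch. The pattern and the sample text are made up for the demonstration and are not taken from the question. The question's patterns happen to work because `\S` is not a string escape Python recognises, so the backslash is left in place (newer Python versions warn about this); `\b` is a case where the two literal forms really do differ.

```python
import re

# In a normal string literal, '\b' is a single backspace character (0x08).
plain = '\bname\b'
# In a raw string, the backslash is kept, so the regex engine sees \b (word boundary).
raw = r'\bname\b'

text = 'Product Name'  # hypothetical sample text

# The plain pattern looks for backspace + "name" + backspace and finds nothing.
plain_match = re.search(plain, text, re.IGNORECASE)
# The raw pattern looks for the whole word "name" and finds it.
raw_match = re.search(raw, text, re.IGNORECASE)

print(len(plain), len(raw))
print(plain_match, raw_match.group(0))
```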

1 Answer

Read the file only once, and search it:

with open('readme.md') as file:
    f = file.read()
    for x in headings:
        content = x.search(f).group(0)
        print(content)

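The first call to read() consumes the whole file and leaves the file position at the end, so every later call to read() on the same file object returns an empty string. In your loop that means only the first pattern is searched against the real content. The second and third patterns are searched against '', search() returns None, and calling .group(0) on None raises the AttributeError you see.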
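The behaviour can be reproduced without a file on disk. The sketch below uses io.StringIO as a stand-in for open('readme.md'), with a shortened copy of the sample text from the question. The helper name find_headings is hypothetical, and returning None for a missing heading is just one way to avoid the AttributeError.

```python
import io
import re

# Shortened copy of the question's readme.md, used in place of a real file.
sample = (
    "# Product Name\n"
    "Test Product\n"
    "## Product Version\n"
    "0.01\n"
    "## Product Description\n"
    "A short overview\n"
)

handle = io.StringIO(sample)
first_read = handle.read()   # consumes the whole stream
second_read = handle.read()  # position is already at the end, so this is ''

# Same patterns as in the question, written as raw strings.
headings = [
    re.compile(r'(?mi)^#+[^\S\n]product.*name'),
    re.compile(r'(?mi)^#+[^\S\n]product.*version'),
    re.compile(r'(?mi)^#+[^\S\n]product.*description'),
]


def find_headings(text, patterns):
    """Return the matched heading for each pattern, or None where nothing matches."""
    results = []
    for pattern in patterns:
        match = pattern.search(text)
        # search() returns None on no match, so check before calling group().
        results.append(match.group(0) if match else None)
    return results


found = find_headings(first_read, headings)     # all three headings are found
missing = find_headings(second_read, headings)  # nothing to match in ''

print(found)
print(missing)
```

Checking the match object before calling group() also gives a clearer result if a heading really is absent from the file.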
