Using the Python below, I am trying to find specific headings within a Markdown file:
import re
source_file = open('readme.md')
#Defines variables containing the headings as Regex searches
product_name = re.compile('(?mi)^#+[^\S\n]product.*name')
product_version = re.compile('(?mi)^#+[^\S\n]product.*version')
product_description = re.compile('(?mi)^#+[^\S\n]product.*description')
#Adds the required headings to a list
headings = [product_name, product_version, product_description]
#Opens the readme, searches for the specified headings, then uses print_section to print the content under the headings
with open('readme.md') as file:
    for x in headings:
        content = x.search(file.read()).group(0)
        print(content)
The readme.md file contains the following placeholder text:
# Product Name
Test Product
## Product Version
0.01
## Product Description
A short overview of what this product is; what it does; and what its intended use case is
# Specifications
## Networking
the quick brown fox jumped over the lazy dog
## Data Usage
This is line one of the sample
This is line two of the sample
The response I get from running this file is:
# Product Name
Traceback (most recent call last):
  File "sample.py", line 16, in <module>
    content = x.search(file.read()).group(0)
AttributeError: 'NoneType' object has no attribute 'group'
So I think there is an issue with the second and third regex patterns. However, in regex testers they appear to match properly. I can't spot any difference between these patterns and the first, which succeeds. I've even tried reordering the content in the Markdown file, and still only product_name matches.
Use raw string literals (r'...' or r"...") for the regular expressions. I don't think it's actually causing a problem in this case, but regular string literals may interpret backslashes differently than the regex engine does.
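Here is a minimal sketch of the raw-string suggestion, together with one likely cause of the traceback: read() consumes the whole file on the first call, so every later call in the loop returns an empty string and search() returns None. The names README_TEXT, HEADINGS, and find_headings are hypothetical, and an in-memory io.StringIO stands in for readme.md so the example is self-contained.

```python
import io
import re

# Stand-in for readme.md (assumption: same headings as in the question).
README_TEXT = """# Product Name
Test Product
## Product Version
0.01
## Product Description
A short overview
"""

# Raw string literals pass backslashes such as \S and \n to the regex engine unchanged.
HEADINGS = [
    re.compile(r'(?mi)^#+[^\S\n]product.*name'),
    re.compile(r'(?mi)^#+[^\S\n]product.*version'),
    re.compile(r'(?mi)^#+[^\S\n]product.*description'),
]


def find_headings(file_obj):
    """Return the first match for each heading pattern, or None if it is absent."""
    text = file_obj.read()  # read once; a second read() would return ''
    results = []
    for pattern in HEADINGS:
        match = pattern.search(text)
        # Guard against None instead of calling .group() unconditionally.
        results.append(match.group(0) if match else None)
    return results


found = find_headings(io.StringIO(README_TEXT))
print(found)
```

With a real file, the same function could be called as find_headings(file) inside the existing with open('readme.md') as file: block.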