Python Re: tested regex patterns returning no matches

Question

Using the Python below, I am trying to find specific headings within a markdown file.

import re

source_file = open('readme.md')

#Defines variables containing the headings as Regex searches

product_name = re.compile('(?mi)^#+[^\S\n]product.*name')
product_version = re.compile('(?mi)^#+[^\S\n]product.*version')
product_description = re.compile('(?mi)^#+[^\S\n]product.*description')

#Adds the required headings to a list
headings = [product_name, product_version, product_description]

#Opens the readme, searches for the specified headings, then uses print_section to print the content under the headings
with open('readme.md') as file:
    for x in headings:
        content = x.search(file.read()).group(0)
        print(content)

The readme.md file contains the following placeholder text

# Product Name
Test Product
## Product Version
0.01
## Product Description
A short overview of what this product is; what it does; and what its intended use case is
# Specifications
## Networking
the quick brown fox jumped over the lazy dog
## Data Usage
This is line one of the sample
This is line two of the sample

The response I get from running this file is:

# Product Name
Traceback (most recent call last):
  File "sample.py", line 16, in <module>
    content = x.search(file.read()).group(0)
AttributeError: 'NoneType' object has no attribute 'group'

So I think there is an issue with the 2nd and 3rd regex patterns. However, in regex testers they appear to match properly. I can't spot any differences between these patterns and the first, which is successful. I've even tried swapping the order of the content in the markdown file, and still only product_name matches.

You really should be using raw strings (r'...' or r"...") for the regular expressions. I don't think it's actually causing a problem in this case, but regular string literals may interpret backslashes in a different way than the regex engine does.
– jasonharper, Commented May 20, 2020 at 12:19
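
To illustrate the comment above, here is a minimal sketch of how a regular string literal and a raw string can differ. The `\bcat\b` pattern and the sample text are made up for the illustration and are not from the question. The question's own patterns happen to work because `\S` is not an escape sequence Python recognizes, so the backslash is left in place; `\b` is recognized (as a backspace character), so it behaves differently.

```python
import re

# In a regular string literal, '\b' is the backspace character (0x08).
plain = '\bcat\b'
# In a raw string, the backslash is kept, so the regex engine sees a word boundary.
raw = r'\bcat\b'

text = 'a cat sat'  # hypothetical sample text

plain_match = re.search(plain, text)  # looks for backspace + "cat" + backspace
raw_match = re.search(raw, text)      # looks for the whole word "cat"

print(len(plain), len(raw))
print(plain_match)
print(raw_match.group(0))
```

Writing every pattern as a raw string avoids having to remember which escapes Python itself interprets.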

ilyankou · Accepted Answer · 2020-05-20 12:11:49Z


Read the file only once, and search it:

with open('readme.md') as file:
    f = file.read()
    for x in headings:
        content = x.search(f).group(0)
        print(content)

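Once read() has reached the end of the file (which happens on your first iteration), the file position stays at the end, so every later call to read() returns an empty string. Searching an empty string finds nothing, so search() returns None, and calling .group(0) on None raises the AttributeError you see.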
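
The behaviour can be reproduced without a file on disk. The sketch below is only an illustration: it uses `io.StringIO` as a stand-in for the opened readme.md, with a shortened copy of the placeholder text. It also shows that `seek(0)` rewinds the stream, which is an alternative to storing the contents in a variable.

```python
import io
import re

# io.StringIO stands in for open('readme.md'); the text is a shortened copy
# of the placeholder readme from the question.
sample = io.StringIO("# Product Name\nTest Product\n## Product Version\n0.01\n")

pattern = re.compile(r'(?mi)^#+[^\S\n]product.*version')

first_read = sample.read()   # consumes the whole stream
second_read = sample.read()  # position is already at the end, so this is ''

match_first = pattern.search(first_read)
match_second = pattern.search(second_read)  # nothing to match in ''

print(repr(second_read))
print(match_first.group(0))
print(match_second)

# seek(0) moves the position back to the start, so read() returns the text again.
sample.seek(0)
third_read = sample.read()
print(third_read == first_read)
```

Checking the result of search() for None before calling .group(0) would also turn a missing heading into something you can handle instead of a traceback.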

edited May 20, 2020 at 12:11

answered May 20, 2020 at 12:06

ilyankou
