Python: Get specific text in a line of a file using Regex

Question

I am using Python to search through a text log file line by line, and I want to save a certain part of each line as a variable. I am using regex, but I don't think I am using it correctly, because I always get None for my variable string_I_want. I looked at other regex questions on here and saw people adding .group() to the end of their re.search call, but that gives me an error. I am not very familiar with regex and can't figure out where I am going wrong.

Sample log file:

2016-03-08 11:23:25  test_data:0317: m=string_I_want max_count: 17655, avg_size: 320, avg_rate: 165

My script:

def get_data(log_file):

    #Read file line by line
    with open(log_file) as f:
        f = f.readlines()

        for line in f:
            date = line[0:10]
            time = line[11:19]

            string_I_want=re.search(r'/m=\w*/g',line)

            print date, time, string_I_want

Don't just guess what those re functions and methods do; read the "Regular Expression HOWTO" for a thorough introduction to using regular expressions in Python 2, and refer to the re reference docs when you need to look up specifics. It will save you time in the long run.
– Kevin J. Chase, Commented May 16, 2016 at 11:29

Wiktor Stribiżew · Accepted Answer · 2016-05-16 10:26:46Z

2

You need to remove the /.../ delimiters and the g flag (a Python regex pattern is a plain string, so the slashes and the g are matched as literal characters), and use a capturing group:

mObj = re.search(r'm=(\w+)',line)
if mObj:
    string_I_want = mObj.group(1)

See this regex demo and the Python demo:

import re
p = r'm=(\w+)'              # Init the regex with a raw string literal (so, no need to use \\w, just \w is enough)
s = "2016-03-08 11:23:25  test_data:0317: m=string_I_want max_count: 17655, avg_size: 320, avg_rate: 165"
mObj = re.search(p, s)      # Execute a regex-based search
if mObj:                    # Check if we got a match
    print(mObj.group(1))    # DEMO: Print the Group 1 value
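Applied to the function from the question, the same pattern could be used as in the sketch below. This is only one possible arrangement, not necessarily what the asker ended up with: the name M_VALUE and the choice to return a list of tuples instead of printing are assumptions made for illustration.

```python
import re

# Hypothetical module-level name; group 1 captures the value after "m=".
M_VALUE = re.compile(r'm=(\w+)')

def get_data(log_file):
    """Return a list of (date, time, value) tuples, one per matching line."""
    results = []
    with open(log_file) as f:
        for line in f:
            match = M_VALUE.search(line)
            if match is None:
                continue  # re.search found no m= field on this line
            date = line[0:10]
            time = line[11:19]
            results.append((date, time, match.group(1)))
    return results
```

Checking the match object before calling .group(1) avoids the AttributeError that appears when .group() is called on None.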

Pattern details:

m= - matches m= literal character sequence (add a space before or \b if a whole word must be matched)
(\w+) - Group 1 capturing 1+ alphanumeric or underscore characters. We can reference this value with the .group(1) method.
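A small sketch of why the \b hint above can matter. The input line here is made up for illustration (the num= field does not appear in the sample log); it shows the plain pattern matching inside a longer key, while the \b version only matches a standalone m=.

```python
import re

# Hypothetical line: "num=5" also contains the character sequence "m=".
line = "num=5 m=abc"

# Without a boundary, the first "m=" found is the one inside "num=5".
loose = re.search(r'm=(\w+)', line).group(1)

# \b requires a word boundary before "m", so "num=" is skipped.
strict = re.search(r'\bm=(\w+)', line).group(1)

print(loose)   # 5
print(strict)  # abc
```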

edited May 16, 2016 at 10:26

answered May 16, 2016 at 10:21

Wiktor Stribiżew


heemayl · Answer · 2016-05-16 10:22:47Z

0

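Use a positive lookbehind, which matches the non-whitespace characters that follow whitespace and m= without including m= in the match: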

(?<=\sm=)\S+

Example:

In [135]: s = '2016-03-08 11:23:25  test_data:0317: m=string_I_want max_count: 17655, avg_size: 320, avg_rate: 165'

In [136]: re.search(r'(?<=\sm=)\S+', s).group()
Out[136]: 'string_I_want'
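A hedged sketch of how the lookbehind pattern behaves on a few inputs. The helper name value_after_m and the extra test strings are assumptions for illustration. Two points stand out: the lookbehind requires a whitespace character before m=, and \S+ keeps any non-whitespace character, including trailing punctuation.

```python
import re

def value_after_m(line):
    """Return the text after ' m=' up to the next whitespace, or None."""
    match = re.search(r'(?<=\sm=)\S+', line)
    # .group() with no argument returns the whole match; guard against None.
    return match.group() if match else None

print(value_after_m("test_data:0317: m=string_I_want max_count: 17655"))  # string_I_want
print(value_after_m("m=at_line_start"))   # None (no whitespace before m=)
print(value_after_m("x m=foo-bar, y"))    # foo-bar, (the comma is kept)
```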

answered May 16, 2016 at 10:22

heemayl


mR.aTA · Answer · 2016-05-16 10:33:49Z

0

Here is what you need:

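import re

def get_data(logfile):
    # The with block closes the file automatically
    with open(logfile, "r") as f:
        for line in f:
            # re.search returns None when a line has no match,
            # so check the match object before calling .group()
            match = re.search(r'(?<=\sm=)\S+', line)
            if match:
                print(match.group())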

answered May 16, 2016 at 10:33

mR.aTA
