0

I have a line from a text file and I'm trying to create a regular expression to match it. This is the line of text:

    2015-01-07 Wed Jan 07 11:03:43.390 Some text here..

My regular expression to match is as follows:

    (?<date>(?<year>(?:\d{4}|\d{2})-(?<month>\d{1,2})-(?<day>\d{1,2})))\s(?<txtEntry1>.*)\s(?<txtEntry2>.*)\s(?<txtEntry3>.*)\s(?<time>(?<hour>\d{2}):(?<minutes>\d{2}):(?<seconds>\d{2}):(?<milli>\d{0,3}))\s(?<txtEntry4>.*)\s(?<txtEntry5>.*))

It doesn't match. I'm not concerned about the 'worded' date Wed Jan 07, so I have just left it as text entries rather than match it up to dd/mm/yy. I have been trying to figure it out, but with no success. Can anyone see where I have gone wrong?

4 Comments
  • It doesn't match because you don't have a pattern for month. Commented Apr 29, 2015 at 10:46
  • Thank you @nhahtdh, that is true! Well spotted. Don't know how I missed that. I have edited the question to reflect this. Thank you. Unfortunately, it still doesn't match. Commented Apr 29, 2015 at 10:50
  • Note that \s(?<txtEntry1>.*)\s(?<txtEntry2>.*)\s(?<txtEntry3>.*) is a terrible idea, since it causes a lot of unnecessary backtracking. Consider at least using \S* in place of .* if you know the number of non-space tokens beforehand. For the trailing \s(?<txtEntry4>.*)\s(?<txtEntry5>.*), I don't know why you need 2 of them, but if you only need the whole text, you can just use one of them, \s(?<txtEntry4>.*), to capture the rest of the string. Commented Apr 29, 2015 at 10:56
  • Hi. I need to split the string afterwards, so I need to be able to target each entry individually, and the values within the 'txtEntry' groups will be different every time. It's working for me now though, so thank you all for your help. Commented Apr 29, 2015 at 11:03

2 Answers

2

There are 2 problems with your regular expression:

  1. There is no pattern specified for the capture group month (now updated).
  2. You have used a colon instead of a period as the separator between seconds and milliseconds: (?<seconds>\d{2}):(?<milli>\d{0,3}). The period has to be escaped (\.) so that it matches a literal dot rather than any character.
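
The separator problem can be checked in isolation. What follows is a minimal sketch in Python, not the asker's code: it assumes Python's re module, which spells named groups (?P<name>...) instead of (?<name>...), and the variable names are chosen purely for illustration. It runs only the time part of the pattern against the sample line, once with each separator.

```python
import re

line = "2015-01-07 Wed Jan 07 11:03:43.390 Some text here.."

# Time sub-pattern as written in the question: a colon before the milliseconds.
time_with_colon = (
    r"(?P<hour>\d{2}):(?P<minutes>\d{2}):(?P<seconds>\d{2}):(?P<milli>\d{0,3})"
)
# Same sub-pattern with an escaped period as the separator.
time_with_period = (
    r"(?P<hour>\d{2}):(?P<minutes>\d{2}):(?P<seconds>\d{2})\.(?P<milli>\d{0,3})"
)

colon_match = re.search(time_with_colon, line)
period_match = re.search(time_with_period, line)

print(colon_match)               # None: no ':' follows the seconds in the line
print(period_match.group(0))     # 11:03:43.390
print(period_match.groupdict())  # hour, minutes, seconds and milli as strings
```

Because the time group fails with the colon, the whole expression fails, regardless of whether the rest of it is correct.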

4 Comments

Another odd thing is that the millisecond digits may be missing (\d{0,3}), but the colon (or the period, after the fix) must still be present.
Yes, that's it too, the dot between seconds and milliseconds! Thank you for your sharp eyes. :-)
@NepSyn14 - It wasn't sharp eyes, it was Expresso turning your complex regex into words that I could read through ;) Highly recommended if you're writing anything but the simplest of regexes.
Expresso? I will look into it. I was using regex101.com but failed to spot those simple little errors. Thank you.
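
The first comment in this thread, about the separator being mandatory while the digits are optional, can be illustrated with a short sketch. It is written in Python under the assumption that the re module's (?P<name>...) spelling is acceptable in place of (?<name>...). The optional_fraction variant is a hypothetical alternative and was not proposed by anyone in the thread.

```python
import re

HMS = r"(?P<hour>\d{2}):(?P<minutes>\d{2}):(?P<seconds>\d{2})"

# As in the corrected pattern: the period is mandatory, the digits are not.
required_period = re.compile(HMS + r"\.(?P<milli>\d{0,3})")

# Hypothetical variant: the period and 1-3 digits are optional as one unit.
optional_fraction = re.compile(HMS + r"(?:\.(?P<milli>\d{1,3}))?")

results = {}
for text in ["11:03:43.390", "11:03:43.", "11:03:43"]:
    results[text] = (
        required_period.fullmatch(text) is not None,
        optional_fraction.fullmatch(text) is not None,
    )
    print(text, "| required period:", results[text][0],
          "| optional fraction:", results[text][1])
```

With the mandatory period, a time ending in a bare dot is accepted and a plain hh:mm:ss is rejected; the variant behaves the other way round. Which behaviour is right depends on the log format, which the question does not specify.
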
1

This works for me:

    (?<date>(?<year>(?:\d{4}|\d{2}))-(?<month>\d{1,2})-(?<day>\d{1,2}))\s(?<txtEntry1>\S*)\s(?<txtEntry2>\S*)\s(?<txtEntry3>\S*)\s(?<time>(?<hour>\d{2}):(?<minutes>\d{2}):(?<seconds>\d{2})\.(?<milli>\d{0,3}))\s(?<txtEntry4>.*)

I'm not sure about your txtEntry5, though.

Found 1 match:
2015-01-07 Wed Jan 07 11:03:43.390 Some text here.. has 13 groups:
2015-01-07 (date)
2015 (year)
01 (month)
07 (day)
Wed (txtEntry1)
Jan (txtEntry2)
07 (txtEntry3)
11:03:43.390 (time)
11 (hour)
03 (minutes)
43 (seconds)
390 (milli)
Some text here.. (txtEntry4)
String literals for use in programs:

C#:

    @"(?<date>(?<year>(?:\d{4}|\d{2}))-(?<month>\d{1,2})-(?<day>\d{1,2}))\s(?<txtEntry1>\S*)\s(?<txtEntry2>\S*)\s(?<txtEntry3>\S*)\s(?<time>(?<hour>\d{2}):(?<minutes>\d{2}):(?<seconds>\d{2})\.(?<milli>\d{0,3}))\s(?<txtEntry4>.*)"
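
For anyone who wants to reproduce the group listing outside a regex tester, here is a hedged sketch in Python rather than C#. It assumes Python's re module, where named groups are written (?P<name>...) instead of (?<name>...); apart from that spelling, the pattern is the one given in this answer. The names LINE_PATTERN and fields are illustrative only.

```python
import re

# The answer's pattern, split over several raw strings for readability.
LINE_PATTERN = re.compile(
    r"(?P<date>(?P<year>(?:\d{4}|\d{2}))-(?P<month>\d{1,2})-(?P<day>\d{1,2}))"
    r"\s(?P<txtEntry1>\S*)\s(?P<txtEntry2>\S*)\s(?P<txtEntry3>\S*)"
    r"\s(?P<time>(?P<hour>\d{2}):(?P<minutes>\d{2}):(?P<seconds>\d{2})"
    r"\.(?P<milli>\d{0,3}))"
    r"\s(?P<txtEntry4>.*)"
)

line = "2015-01-07 Wed Jan 07 11:03:43.390 Some text here.."

match = LINE_PATTERN.match(line)
if match is None:
    raise ValueError("line does not have the expected layout")

# groupdict() maps every named group to the text it captured.
fields = match.groupdict()
for name, value in fields.items():
    print(f"{value} ({name})")
```

Each entry can then be read individually, for example fields["txtEntry2"] or fields["milli"], which is what the asker said they needed.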

2 Comments

Gee, you think specifying what you've changed might be of some help?
@Jamiec I just fixed the brackets, switched . to \S (non-space), and fixed the issue with the milliseconds that you already mentioned...
