
I'm using

\[(.*?)\]|Response code (?P<code>\d+)

to search these fields:

[2018-01-20 05:19:54.812] INFO    com.mulesoft.ch.monitoring.MonitoringCoreExtension [qtp689806602-32]: Monitoring enabled: true
[2018-01-20 05:19:54.813] INFO    com.mulesoft.ch.monitoring.MonitoringCoreExtension [qtp689806602-32]: Registering ping flow injector...
[2018-01-20 05:19:54.833] INFO    com.mulesoft.ch.queue.boot.PersistentQueueCoreExtension [qtp689806602-32]: The PersistentQueueManager is NOT configured. The normal VM queue manager will be used.
[2018-01-20 05:19:54.841] INFO    org.mule.lifecycle.AbstractLifecycleManager [qtp689806602-32]: Initialising RegistryBroker
[2018-01-20 05:19:54.872] INFO
[2018-01-24 02:14:30.153] INFO    org.mule.routing.SynchronousUntilSuccessfulProcessingStrategy [[swt-fastsalescomp-anaplan-schedules].ScatterGatherWorkManager.24]: Exception thrown inside until-successful org.mule.module.http.internal.request.ResponseValidatorException: Response code 503 mapped as failure.

But I only want it to match the dates, not the other text that appears between brackets, and also assign the response code to a named group 'code' (that part's working). I tried several variations, including

\[(\d*?)\]
\[(\W*?)\]
\[^(\.*?){23}$\]

But none of these finds anything.

Bonus: I might be able to figure this one out once the rest is solved, but I might as well ask while I'm here. How do I update a dictionary with the date and code as a key-value pair?

  • Only dates, not the time? Commented Feb 12, 2018 at 0:21
  • See this approach. Commented Feb 12, 2018 at 0:28
  • @WiktorStribiżew you can add a ^ to the first part of your regex to require that the timestamp starts at the beginning of the line! Otherwise great as usual ;-) Commented Feb 12, 2018 at 0:30

1 Answer

Regex: \d{4}(?:-\d{2}){2}[^]]+|(?<=Response code )(?P<code>\d+)

Details:

  • (?:) Non-capturing group
  • {n} Matches exactly n times
  • [^] Negated character class ([^]] matches any character except ])
  • | Or
  • (?<=) Positive Lookbehind
  • (?P<>) Named Capture Group

Python code:

import re

# `text` is assumed to hold the log contents as a single string
for match in re.finditer(r'\d{4}(?:-\d{2}){2}[^]]+|(?<=Response code )(?P<code>\d+)', text):
    print(match.group())

Output:

2018-01-20 05:19:54.812
2018-01-20 05:19:54.813
2018-01-20 05:19:54.833
2018-01-20 05:19:54.841
2018-01-20 05:19:54.872
2018-01-24 02:14:30.153
503
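For the bonus question, one possible approach is to remember the most recent timestamp and store it as the key whenever the 'code' group matches. This is only a sketch: the names `pattern`, `last_timestamp` and `codes_by_timestamp` are made up for illustration, the sample `text` is trimmed from the log in the question, and it assumes each response code belongs to the nearest timestamp before it.

```python
import re

# Sample trimmed from the log in the question; in practice `text` would be
# the full log contents read from a file.
text = (
    "[2018-01-20 05:19:54.841] INFO    org.mule.lifecycle.AbstractLifecycleManager "
    "[qtp689806602-32]: Initialising RegistryBroker\n"
    "[2018-01-24 02:14:30.153] INFO    org.mule.routing.SynchronousUntilSuccessfulProcessingStrategy "
    "[[swt-fastsalescomp-anaplan-schedules].ScatterGatherWorkManager.24]: "
    "Exception thrown inside until-successful "
    "org.mule.module.http.internal.request.ResponseValidatorException: "
    "Response code 503 mapped as failure.\n"
)

pattern = re.compile(r'\d{4}(?:-\d{2}){2}[^]]+|(?<=Response code )(?P<code>\d+)')

codes_by_timestamp = {}
last_timestamp = None  # most recent timestamp seen so far
for match in pattern.finditer(text):
    if match.group('code') is None:
        # The first alternative matched: this is a timestamp.
        last_timestamp = match.group()
    elif last_timestamp is not None:
        # The second alternative matched: pair the code with that timestamp.
        codes_by_timestamp[last_timestamp] = match.group('code')

print(codes_by_timestamp)
```

Timestamps that are never followed by a response code are simply overwritten and never stored, so the dictionary only holds entries that have a code.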


4 Comments

As a followup, how would I only return time values that have a 503 with them?
@Cdhippen You mean 02:14:30.153?
Yes. My log has literally thousands of time entries; I'm trying to pull out only the lines that have both a time and a status code.
changed it to this: pastebin.com/9wNsiU6u and now I have exactly what I want (I don't need the milliseconds, that will just complicate things since I'm going to try and put it in matplotlib) Thanks a ton for the help!
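The pastebin contents are not reproduced here, so the following is only a guess at one way to do what this follow-up describes, not the asker's actual code. It checks each line against a single anchored pattern that needs both a leading timestamp and a response code, and leaves the milliseconds out of the captured time. The names `line_pattern`, `log_lines` and `hits` are hypothetical.

```python
import re

# Two sample lines trimmed from the log in the question.
log_lines = [
    "[2018-01-20 05:19:54.812] INFO    com.mulesoft.ch.monitoring.MonitoringCoreExtension "
    "[qtp689806602-32]: Monitoring enabled: true",
    "[2018-01-24 02:14:30.153] INFO    org.mule.routing.SynchronousUntilSuccessfulProcessingStrategy "
    "[[swt-fastsalescomp-anaplan-schedules].ScatterGatherWorkManager.24]: "
    "Exception thrown inside until-successful "
    "org.mule.module.http.internal.request.ResponseValidatorException: "
    "Response code 503 mapped as failure.",
]

# ^\[ anchors the timestamp at the start of the line; the milliseconds are
# matched by \.\d+ but deliberately left outside the named groups.
line_pattern = re.compile(
    r'^\[(?P<date>\d{4}-\d{2}-\d{2}) (?P<time>\d{2}:\d{2}:\d{2})\.\d+\]'
    r'.*Response code (?P<code>\d+)'
)

hits = []
for line in log_lines:
    m = line_pattern.search(line)
    if m:  # only lines that contain both a timestamp and a response code
        hits.append((m.group('date'), m.group('time'), m.group('code')))

print(hits)
```

Working line by line keeps each code tied to the timestamp on its own line, which may be easier to reason about than the alternation when the log is large.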
