1

I have a file of GC log output with lines like Total time for which application threads were stopped: 0.0017830 seconds, Stopping threads took: 0.0002897 seconds, and many more similar lines. For troubleshooting purposes I need to extract the lines where the value after stopped: is 1.x seconds or more.

I tried grep 'stopped[: 1-9]*', but I am not very experienced with regex. Could you please help me?

Thank you.

5 Answers

3

Would it not be easier to simply exclude those where the time was low?

grep 'stopped: ' file | grep -v 'stopped: 0'

1 Comment

Aah! How did I miss the -v option?
2

Try

 grep -E 'stopped: ([1-9]\.|[0-9]{2}\.)' file

to capture 10. as well.

Or, better, factor out the common element and allow more than two digits:

 grep -E 'stopped: ([1-9]|[0-9]{2,})\.' file
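
To check the pattern before running it on a real log, here is a small sketch on made-up sample lines. The timings and the sample and matches variable names are invented for illustration and are not taken from an actual GC log.

```shell
# Hypothetical sample lines; the timings are made up for illustration.
sample='Total time for which application threads were stopped: 0.0017830 seconds, Stopping threads took: 0.0002897 seconds
Total time for which application threads were stopped: 1.2017830 seconds, Stopping threads took: 0.0002897 seconds
Total time for which application threads were stopped: 12.5000000 seconds, Stopping threads took: 0.0002897 seconds
Total time for which application threads were stopped: 0.9000000 seconds, Stopping threads took: 0.0002897 seconds'

# Keep lines whose "stopped:" value has either a single non-zero digit
# or two or more digits before the decimal point.
matches=$(printf '%s\n' "$sample" | grep -E 'stopped: ([1-9]|[0-9]{2,})\.')
printf '%s\n' "$matches"
```

Only the 1.2017830 and 12.5000000 lines should be printed. Note that the pattern also accepts exactly 1.0000000, so it selects values of 1 second or more.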

6 Comments

You just changed the way I think about this; it's a perfect solution :)
In ([1-9]\.|[0-9]{2}\.), [1-9] is the first alternative, and for the decimal point you added a dot (.) escaped with a backslash (\) so it is treated as a literal character. {2} is for the next two decimals, so 00 to 99. But why the last dot (.), and why did you escape it? I just can't link this part to my question. Could you please explain? Thank you.
Yes, it will match 00. as well, but I didn't think that's a valid format to consider (unnecessary leading zeros). This simply says there is either a digit 1-9 before the decimal point or a two-digit number before the decimal point. It assumes the leading digit is significant, without leading zeros. Note that the first version won't capture 3 or more digits; I edited the answer to include them.
@ccf I understand that, I am just trying to interpret the regex.
The check is the regex equivalent of "1 or greater". Translation: if it's a single digit before the decimal point, it should be 1 or more; otherwise there must be at least two digits before the decimal point.
1

I would recommend using egrep for this job which gives you more regex options.

Here's a starting point for a regex that may fit your use-case:

egrep "stopped: [0-9]+" data.txt

This will return any line that has stopped: in it followed by at least one number.
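
As a rough check of what this pattern selects, the sketch below counts matches on two made-up lines (the values and variable names are invented for illustration). It uses grep -E, which is the equivalent spelling of egrep.

```shell
# Two hypothetical log lines: one short pause and one long pause.
lines='application threads were stopped: 0.0017830 seconds
application threads were stopped: 7.0011040 seconds'

# [0-9]+ accepts a leading 0, so both lines match.
any_digit=$(printf '%s\n' "$lines" | grep -Ec 'stopped: [0-9]+')

# [1-9] rejects a leading 0, so only the 7.x line matches.
nonzero_first=$(printf '%s\n' "$lines" | grep -Ec 'stopped: [1-9]')

echo "$any_digit $nonzero_first"
```

So the pattern as written is only a starting point: it needs a non-zero first digit before it filters out the sub-second pauses.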

2 Comments

@Raja: But how will it filter results for values greater than 1.x sec?
@anubhava I modified it as per my needs: echo " application threads were stopped: 7.0011040 seconds" | egrep "stopped: [1-9]+" outputs application threads were stopped: 7.0011040 seconds
1

You can use GNU awk with its FPAT variable:

awk -v FPAT="stopped: *[0-9.]+" '{val=$1; sub(/.*: */, "", val)} val > 1' file

With FPAT we define a field as whatever matches the regex stopped: *[0-9.]+. That gives us something like stopped: 1.1017830 in $1. The sub function then removes everything up to the colon and the following spaces, leaving only the number, i.e. 1.1017830, in the variable val.

Finally val > 1 will print rows where this number val is greater than 1.
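
FPAT is specific to GNU awk. If only a POSIX awk is available, a similar filter can be sketched with match() and substr(). This is an alternative under that assumption, not the answer's own method; the function name stopped_over and its arguments are hypothetical.

```shell
# Print lines whose "stopped:" value is numerically greater than a threshold.
# Uses match()/RSTART/RLENGTH from POSIX awk instead of gawk's FPAT.
stopped_over() {
    # $1 = threshold in seconds, $2 = log file (illustrative arguments)
    awk -v limit="$1" '
        match($0, /stopped: *[0-9.]+/) {
            val = substr($0, RSTART, RLENGTH)   # e.g. "stopped: 1.1017830"
            sub(/^stopped: */, "", val)         # keep only the number
            if (val + 0 > limit + 0) print      # +0 forces a numeric comparison
        }
    ' "$2"
}
```

It could be called as stopped_over 1 file. Because the comparison is numeric, the threshold can be any value, such as 0.5 or 2.5, which is awkward to express as a regex.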

3 Comments

Can you explain? It's a bit hard to understand.
I've added some explanation in my answer.
Excellent Anubhava :)
0
grep -E 'stopped: [1-9][0-9]*\.[0-9]+' file

[1-9][0-9]*\.[0-9]+ makes sure the value after stopped: is 1.x sec or more.
