Regex match a pattern occurring multiple times in a string

Question

Using grep in Ubuntu, I am trying to write a regex that matches a pattern repeated multiple times in a line.

Example: 0:0, 80:3, 443:0, 8883:0, 9000:0, 9001:0,

The regex I tried is -

([0-9]+:[0-9]+, )+

But it only matches up to -

0:0, 80:3, 443:0, 8883:0, 9000:0,

I want it to match the complete line. I would also like the regex to check that both 80 and 443 are present in the matched string.

Expectation -

The following lines should be matched -

0:0, 80:3, 443:0, 8883:0, 9000:0, 9001:0,
0:0, 80:0, 443:1, 8883:0, 9000:0, 9001:0,
0:0, 80:0, 443:0, 8883:0, 9000:0, 9001:0,
0:0, 80:3, 443:1, 8883:0, 9000:0, 9001:0,

and the ones below should not be matched -

0:0, 443:0, 8883:0, 9000:0, 9001:0,
0:0, 80:0, 8883:0, 9000:0, 9001:0,
0:0, 8883:0, 9000:0, 9001:0,

Please add your desired output for that sample input to your question (no comment).
– Cyrus, Commented Feb 2, 2022 at 19:21
It's not matching the last term because there's no space at the end of your input string, but your regex requires one. If the spaces are optional, you could just use " *" instead of " " in the regex. If not, you need to match "either a space or the end of the line", which would be "( |$)" instead of " ". If you are going to examine the match results and don't want to capture the spaces, you can use a non-capturing group, "(?: |$)", in regex flavors that support it.
– Kevin Perry, Commented Feb 2, 2022 at 19:45
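
The comment above can be checked with a small sketch. It assumes a grep with the -E and -o options (GNU grep has both); the variable names are only illustrative. The first pattern is the one from the question, the second uses "( |$)" after the comma.

```shell
# Sample line from the question: it ends with a comma and no trailing space.
line='0:0, 80:3, 443:0, 8883:0, 9000:0, 9001:0,'

# Original pattern: every pair must be followed by ", ", so the last pair is left out.
partial=$(printf '%s\n' "$line" | grep -oE '([0-9]+:[0-9]+, )+')

# Adjusted pattern: after the comma, accept either a space or the end of the line.
full=$(printf '%s\n' "$line" | grep -oE '([0-9]+:[0-9]+,( |$))+')

# Brackets make the trailing space of the partial match visible.
printf '[%s]\n' "$partial" "$full"
```

Note that the "(?: |$)" form needs a Perl-style engine (for example grep -P where available); plain -E only has capturing groups.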

The fourth bird · Accepted Answer · 2022-02-02 22:30:49Z

You can use

^[0-9]+:[0-9]+, 80:[0-9]+, 443:[0-9]+(, [0-9]+:[0-9]+)+,$

See the regex demo.
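
Since the question asks about grep, here is a minimal sketch of how this pattern might be applied with grep -E to the sample lines from the question. The variable names are illustrative. Note that this pattern expects 80 and 443 in the second and third positions.

```shell
# The seven sample lines from the question (four expected matches, three non-matches).
lines='0:0, 80:3, 443:0, 8883:0, 9000:0, 9001:0,
0:0, 80:0, 443:1, 8883:0, 9000:0, 9001:0,
0:0, 80:0, 443:0, 8883:0, 9000:0, 9001:0,
0:0, 80:3, 443:1, 8883:0, 9000:0, 9001:0,
0:0, 443:0, 8883:0, 9000:0, 9001:0,
0:0, 80:0, 8883:0, 9000:0, 9001:0,
0:0, 8883:0, 9000:0, 9001:0,'

# Anchored pattern: first pair, then 80, then 443, then one or more further pairs.
pattern='^[0-9]+:[0-9]+, 80:[0-9]+, 443:[0-9]+(, [0-9]+:[0-9]+)+,$'

# -E enables extended regular expressions (needed for + and grouping).
matched=$(printf '%s\n' "$lines" | grep -E "$pattern")
printf '%s\n' "$matched"
```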

Also, consider an awk solution such as

awk '/^[0-9]+:[0-9]+(, [0-9]+:[0-9]+)+,$/ && /80/ && /443/' file

See the online demo:

#!/bin/bash
s='0:0, 80:3, 443:0, 8883:0, 9000:0, 9001:0,
0:0, 80:0, 443:1, 8883:0, 9000:0, 9001:0,
0:0, 80:0, 443:0, 8883:0, 9000:0, 9001:0,
0:0, 80:3, 443:1, 8883:0, 9000:0, 9001:0,
0:0, 443:0, 8883:0, 9000:0, 9001:0,
0:0, 80:0, 8883:0, 9000:0, 9001:0,
0:0, 8883:0, 9000:0, 9001:0,'
awk '/^[0-9]+:[0-9]+(, [0-9]+:[0-9]+)+,$/ && /80/ && /443/' <<< "$s"

Output:

0:0, 80:3, 443:0, 8883:0, 9000:0, 9001:0,
0:0, 80:0, 443:1, 8883:0, 9000:0, 9001:0,
0:0, 80:0, 443:0, 8883:0, 9000:0, 9001:0,
0:0, 80:3, 443:1, 8883:0, 9000:0, 9001:0,

edited Feb 2, 2022 at 22:30

The fourth bird

answered Feb 2, 2022 at 20:23

Wiktor Stribiżew

2 Comments

Ira Over a year ago

Thanks Wiktor - Can the regex be changed so that the lines with 80:3, 443:0, can be matched irrespective of their positions? Eg. Both 0:0, 80:0, 443:1, 8883:0, and 0:0, 443:1, 80:0, 8883:0, are matched.

Wiktor Stribiżew Over a year ago

@Ira This is what my awk does, see awk '/^[0-9]+:[0-9]+(, [0-9]+:[0-9]+)+,$/ && /80/ && /443/' file
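
One caveat with the order-independent awk above: /80/ and /443/ match those digits anywhere in the line, so a key such as 8080 or a count of 80 would also satisfy them. A stricter variant, sketched here as an assumption and not part of the original answer, prepends ", " to the line and uses index() to look for ", 80:" and ", 443:" as complete keys. The sample lines are made up for illustration.

```shell
# Illustrative input: the first two lines hold 80 and 443 in either order,
# the third has 8080 but no 80 key, the fourth lacks 443.
lines='0:0, 80:0, 443:1, 8883:0,
0:0, 443:1, 80:0, 8883:0,
0:0, 8080:0, 443:1, 8883:0,
0:0, 80:0, 8883:0, 9000:0,'

# 1) The whole line must be a comma-separated list of number:number pairs.
# 2) ", 80:" and ", 443:" must each occur once ", " is prepended to the line,
#    which makes a key at the very start of the line look like any other key.
matched=$(printf '%s\n' "$lines" | awk '
  /^[0-9]+:[0-9]+(, [0-9]+:[0-9]+)+,$/ &&
  index(", " $0, ", 80:") &&
  index(", " $0, ", 443:")
')
printf '%s\n' "$matched"
```

Only the first two lines satisfy all three conditions, regardless of where 80 and 443 appear.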

RavinderSingh13 · Accepted Answer · 2022-02-03 01:52:21Z

Here is a stricter awk pattern match, built from your shown samples. It was written and tested in GNU awk and should work in any awk that supports interval expressions such as {3}. A short explanation of the awk code: awk works on a condition/regexp-then-action model. Only the condition (the regexp) is given here, with no action, so when the regexp matches, awk applies its default action and prints the line.

awk '/^0:[0-9],[[:space:]]+80:[0-9],[[:space:]]+443:[0-9],[[:space:]]+8883:[0-9](,[[:space:]]+9[0-9]{3}:[0-9]){2},$/' Input_file

Explanation: a detailed breakdown of the regex above.

^0:[0-9],[[:space:]]+             ##From the start of the line, match 0, a colon, one digit, a comma, then 1 or more whitespace characters.
80:[0-9],[[:space:]]+             ##Followed by 80, a colon, one digit, a comma, and 1 or more whitespace characters.
443:[0-9],[[:space:]]+            ##Followed by 443, a colon, one digit, a comma, and 1 or more whitespace characters.
8883:[0-9]                        ##Followed by 8883, a colon, and one digit.
(,[[:space:]]+9[0-9]{3}:[0-9]){2} ##Followed by a comma, whitespace, 9 plus 3 more digits, a colon, and one digit; this group must occur exactly 2 times (for the last two values, e.g. 9000 and 9001).
,$                                ##Finally, a comma at the end of the line.
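
Because the question is about grep, the same pattern can also be tried with grep -E. This is a sketch under the assumption that the grep in use supports POSIX character classes and interval expressions, which GNU grep does; the variable names are illustrative.

```shell
# The seven sample lines from the question.
lines='0:0, 80:3, 443:0, 8883:0, 9000:0, 9001:0,
0:0, 80:0, 443:1, 8883:0, 9000:0, 9001:0,
0:0, 80:0, 443:0, 8883:0, 9000:0, 9001:0,
0:0, 80:3, 443:1, 8883:0, 9000:0, 9001:0,
0:0, 443:0, 8883:0, 9000:0, 9001:0,
0:0, 80:0, 8883:0, 9000:0, 9001:0,
0:0, 8883:0, 9000:0, 9001:0,'

# Same regex as in the awk command above, passed to grep -E.
pattern='^0:[0-9],[[:space:]]+80:[0-9],[[:space:]]+443:[0-9],[[:space:]]+8883:[0-9](,[[:space:]]+9[0-9]{3}:[0-9]){2},$'

matched=$(printf '%s\n' "$lines" | grep -E "$pattern")
printf '%s\n' "$matched"
```

This pattern hard-codes the port order and single-digit counts, so it only fits input shaped exactly like the samples.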
