Here is a (real-world) text:

<tr>
randomtext
ip_(45.54.58.85)
randomtext..
port(randomtext45)
randomtext random...
</tr>
<tr>
randomtext ran
ip_(5.55.45.8)  
randomtext4
port(other$_text_other_length444)
</tr>
<tr>
randomtext
random
port(other$text52)
</tr>

The output should be:

45.54.58.85 45

5.55.45.8 444

I know how to grep the IPs 45.54.58.85 and 5.55.45.8:

awk 'BEGIN{ RS="<tr>"}1' file | grep -oP '(?<=ip_\()[^)]*'

How can I grep the port, given that random text of varying length follows port( ?

I added a third record that should not appear in the output, because it has no ip.

1
  • hi Avinash; 45.54.58.85 45 Commented Jul 20, 2014 at 14:00

4 Answers

Using GNU Awk:

gawk 'BEGIN { RS = "<tr>" } match($0, /\nip_[(]([^)]+)[)].*\nport[(][^)]*[^0-9]([0-9]+)[)]/, a) { print a[1], a[2] }' your_file

And another that's compatible with any Awk:

awk -F '[()]' '$1 == "<tr>" { i = 0 } $1 == "ip_" { i = $2 } $1 == "port" && i { sub(/.*[^0-9]/, "", $2); if (length($2)) print i, $2 }' your_file

Output:

45.54.58.85 45
5.55.45.8 444
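
To try the portable command without an existing input file, the sample from the question can be written to a temporary file first. This is only a sketch: the `sample` and `result` variable names are introduced here for illustration, and the awk program is the one shown above, spread over several lines.

```shell
# Recreate the sample input from the question in a temporary file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<tr>
randomtext
ip_(45.54.58.85)
randomtext..
port(randomtext45)
randomtext random...
</tr>
<tr>
randomtext ran
ip_(5.55.45.8)
randomtext4
port(other$_text_other_length444)
</tr>
<tr>
randomtext
random
port(other$text52)
</tr>
EOF

# Run the portable command; the third record has no ip_ line, so it is skipped.
result=$(awk -F '[()]' '
    $1 == "<tr>" { i = 0 }
    $1 == "ip_"  { i = $2 }
    $1 == "port" && i { sub(/.*[^0-9]/, "", $2); if (length($2)) print i, $2 }
' "$sample")
printf '%s\n' "$result"
rm -f "$sample"
```

Resetting `i` at every `<tr>` is what keeps the port of the third record from being paired with the IP of the second one.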

3 Comments

thanx Konsolebox, we 've got only part of port info in your output
@blue_xylo I got a new version. I'll fix the old one.
+1, but this will fail to print the port if it happens to be 0.

Using GNU awk, grep, and paste:

$ awk 'BEGIN{ RS="<tr>"}/ip_/{print;}' file | grep -oP 'ip_\(\K[^)]*|port\(\D*\K\d+' | paste - -
45.54.58.85 45
5.55.45.8   444

Explanation:

  • awk 'BEGIN{ RS="<tr>"}/ip_/{print;}' file: with the record separator (RS) set to <tr>, this awk command prints only the records that contain the string ip_.
  • ip_\(\K[^)]*: matches the text between ip_( and the next ) symbol. \K discards everything matched before it, so only the IP is printed.
  • |: alternation (logical OR) between the two patterns.
  • port\(\D*\K\d+: skips any non-digit characters after port( and prints the run of digits that follows.
  • paste - -: joins every two consecutive lines into one.
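
To see what each stage contributes, the grep and paste steps can be run separately on a single record. A minimal sketch, assuming a GNU grep built with PCRE support (the `-P` option); the record is copied from the question, and the `record`, `matches`, and `joined` variable names are made up for this example.

```shell
# One record from the question, used as sample input.
record='ip_(45.54.58.85)
randomtext..
port(randomtext45)'

# Stage 1: grep -o prints each match on its own line (the IP, then the port).
matches=$(printf '%s\n' "$record" | grep -oP 'ip_\(\K[^)]*|port\(\D*\K\d+')
printf '%s\n' "$matches"

# Stage 2: paste - - joins every two consecutive lines with a tab.
joined=$(printf '%s\n' "$matches" | paste - -)
printf '%s\n' "$joined"
```

Because paste pairs lines purely by position, this approach relies on every record that reaches grep producing exactly one IP match and one port match.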

5 Comments

Pretty cool indeed, Konsolebox, and many thanks to you, but the question was 'how to grep...'. Avinash rocks again! Avinash, if possible, could you give a bit more detail about your solution?
@blue_xylo sure just wait for some mins.
grep can never give the solution on its own and awk does the significant work. Actually, awk can work on its own and grep is only a redundancy but it's ok no worries :) Welcome.
The question may have been "how to grep", but the correct answer is "grep is not a verb, and in this case it's not the best tool".
Actually grep is a verb: it's an abbreviation for Globally find a Regular Expression and Print, which is written g/re/p in ed. There is also a tool by the same name, but that tool's name came after the desired action and is just one approach to grep-ing :-). You're right, though, the tool grep has no useful part to play in solving this problem.

Here is another awk solution:

awk -F"[()]" '/^ip/ {ip=$2;f=NR} f && NR==f+2 {n=split($2,a,"[a-z]+");print ip,a[n]}' file
45.54.58.85 45
5.55.45.8 444

How it works:

awk -F"[()]" '              # Split fields on "(" or ")"
/^ip/ {                     # If the line starts with "ip":
    ip=$2                   #   save the IP (field $2)
    f=NR}                   #   remember this line number
f && NR==f+2 {              # On the second line after the IP line:
    n=split($2,a,"[a-z]+")  #   split the port text on runs of lowercase letters
    print ip,a[n]           #   print the IP and the last piece (the port)
    }' file                 # Read the file


With any modern awk:

$ awk -F'[()]' '
    $1=="ip_"   { ip=$2 }
    $1=="port"  { sub(/.*[^[:digit:]]/,"",$2); port=$2 }
    $1=="</tr>" { if (ip) print ip, port; ip="" }
' file
45.54.58.85 45
5.55.45.8 444

Couldn't be much simpler and clearer IMHO.
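
One edge case the sample does not exercise: `port` is never cleared, so a record that has an `ip_` line but no `port` line would be printed with the port left over from the previous record. A minimal sketch of a variant that clears both variables at each `</tr>` follows. The two-record input is hypothetical (it is not from the question), and `[^0-9]` is used in place of `[^[:digit:]]` only so the example also runs on older awks.

```shell
# Hypothetical input: the second record has an IP but no port line.
input='<tr>
ip_(1.2.3.4)
port(abc80)
</tr>
<tr>
ip_(5.6.7.8)
</tr>'

result=$(printf '%s\n' "$input" | awk -F'[()]' '
    $1 == "ip_"   { ip = $2 }
    $1 == "port"  { sub(/.*[^0-9]/, "", $2); port = $2 }
    $1 == "</tr>" {
        if (ip != "" && port != "") print ip, port   # need both values
        ip = port = ""                               # clear state for the next record
    }
')
printf '%s\n' "$result"
```

Only the first record is printed here. Comparing against the empty string instead of testing `if (ip)` also avoids relying on how awk treats the truthiness of a field value.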
