How to fetch a particular string using a sed command

Question

I have an input string like the one below:

   VAL:1|b:2|c:3|VAL:<har:[email protected]>; tag=vy6r5BpcvQ|VAl:1234|name:mnp|VAL:91987654321

There are more than 1000 rows like this.

I want to fetch the values of the a field and the d field (the first and fourth fields, both named VAL in the sample above), but for the d field I want only har:[email protected].

I tried this:

cat $filename | grep -v Orig |sed -e 's/['a:','d:']//g' |awk -F'|' -v OFS=',' '{print $1 "," $4}' >> $NGW_DATA_FILE

The output I got is below:

1,<[email protected]>; tag=vy6r5BpcvQ

I want it like this:

1,har:[email protected]

Where did I make the mistake and how do I solve it?

If you are using Awk anyway, do all of this processing in Awk.
– tripleee, Commented Dec 11, 2020 at 7:41
Sometimes I may receive the values under the same field name, like val:, instead of a and d. How can I fetch the values then?
– mark, Commented Dec 11, 2020 at 7:44
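
On the "where did I make the mistake" part of the question: after the shell removes the inner quotes, sed receives s/[a:,d:]//g. The brackets form a character set, so every a, d, colon and comma anywhere on the line is deleted, not just the prefixes a: and d:. A minimal sketch of that effect, assuming the original a/d-style input and a placeholder address (user@example.com), since the real one is redacted in the question:

```shell
# Placeholder sample line; the address is an assumption, not the OP's data.
line='a:1|b:2|c:3|d:<har:user@example.com>; tag=xyz|'

# What the shell actually hands to sed after quote removal.
printf '%s\n' 's/['a:','d:']//g'
# prints: s/[a:,d:]//g

# The bracket expression deletes single characters from the set: a  :  ,  d
broken() { sed -e 's/[a:,d:]//g'; }
printf '%s\n' "$line" | broken
# prints: 1|b2|c3|<hruser@exmple.com>; tg=xyz|
```

This is why stripping the keys with a character set damages the rest of the line; anchored patterns such as ^[^:]*: avoid that.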

RavinderSingh13 · Accepted Answer · 2020-12-11 08:22:55Z

5

EDIT: Following the OP's change to the input and the OP's comments, adding this version, which takes the first and fourth fields by position.

awk '
BEGIN{ FS="|"; OFS="," }
{
  sub(/[^:]*:/,"",$1)
  gsub(/^[^<]*|; .*/,"",$4)
  gsub(/^<|>$/,"",$4)
  print $1,$4
}'  Input_file
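
A quick way to check the positional version above before running it on the full file. This is only a sketch: it wraps the same program in a hypothetical shell function, first_and_fourth, and feeds it one line with a placeholder address (an assumption, since the real address is redacted in the question).

```shell
# Hypothetical wrapper around the field-1/field-4 program shown above.
first_and_fourth() {
  awk '
  BEGIN { FS = "|"; OFS = "," }
  {
    sub(/[^:]*:/, "", $1)          # drop the key and first colon from field 1
    gsub(/^[^<]*|; .*/, "", $4)    # drop text before "<" and from "; " onward
    gsub(/^<|>$/, "", $4)          # drop the surrounding angle brackets
    print $1, $4
  }'
}

printf '%s\n' 'VAL:1|b:2|c:3|VAL:<har:user@example.com>; tag=xyz|VAl:1234' |
  first_and_fourth
# prints: 1,har:user@example.com
```

Because the fields are picked by position, the key names do not matter here; the trade-off is that the values must always sit in fields 1 and 4.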

With the originally shown samples, please try the following; it was written and tested with those samples in GNU awk.

awk '
BEGIN{
  FS="|"
  OFS=","
}
{
  val=""
  for(i=1;i<=NF;i++){
    split($i,arr,":")
    if(arr[1]=="a" || arr[1]=="d"){
      gsub(/^[^:]*:|; .*/,"",$i)
      gsub(/^<|>$/,"",$i)
      val=(val?val OFS:"")$i
    }
  }
  print val
}
' Input_file

Explanation: A detailed, line-by-line explanation of the above.

awk '                                ##Starting awk program from here.
BEGIN{                               ##Starting BEGIN section of this program from here.
  FS="|"                             ##Setting FS as pipe here.
  OFS=","                            ##Setting OFS as comma here.
}
{
  val=""                             ##Resetting val for each line (so values from the previous line do not carry over).
  for(i=1;i<=NF;i++){                ##Traversing through all fields here.
    split($i,arr,":")                ##Splitting current field into arr with : as the delimiter.
    if(arr[1]=="a" || arr[1]=="d"){  ##Checking if the first element of arr is either a OR d.
      gsub(/^[^:]*:|; .*/,"",$i)     ##Globally removing from $i everything from the start up to the 1st colon, and everything from "; " to the end.
      gsub(/^<|>$/,"",$i)            ##Removing the leading < and the trailing > from $i.
      val=(val?val OFS:"")$i         ##Appending the current field value to val, separated by OFS.
    }
  }
  print val                          ##printing val here.
}
' Input_file                         ##Mentioning Input_file name here.
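
If the key names change (for example VAL instead of a and d, as raised in the comments), the same loop can take the wanted keys as a parameter instead of hard-coding them. A minimal sketch, assuming a hypothetical wrapper extract_keys whose argument is a comma-separated key list, and a placeholder address in the sample line:

```shell
# Hypothetical wrapper: $1 is a comma-separated list of keys to extract.
extract_keys() {
  awk -v keys="$1" '
  BEGIN {
    FS = "|"; OFS = ","
    n = split(keys, k, ",")
    for (j = 1; j <= n; j++) want[k[j]] = 1   # set of wanted key names
  }
  {
    val = ""
    for (i = 1; i <= NF; i++) {
      split($i, arr, ":")
      if (arr[1] in want) {
        gsub(/^[^:]*:|; .*/, "", $i)   # strip "key:" and any "; ..." tail
        gsub(/^<|>$/, "", $i)          # strip surrounding angle brackets
        val = (val != "" ? val OFS : "") $i
      }
    }
    print val
  }'
}

printf '%s\n' 'a:1|b:2|c:3|d:<har:user@example.com>; tag=xyz|' | extract_keys 'a,d'
# prints: 1,har:user@example.com
```

Called as extract_keys 'VAL', it picks up every field whose key is exactly VAL, so the comparison stays case-sensitive just like the original arr[1]=="a" test.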

edited Dec 11, 2020 at 8:22

answered Dec 11, 2020 at 7:33

RavinderSingh13


9 Comments

mark Over a year ago

Hi Sir, thanks for the answer; it works for this case. I have one doubt: what if, instead of A and D, both values come under the same field name, like D:1|D:<har:[email protected]>; tag=vy6r5BpcvQ? What would I have to alter, given that the file may contain a huge amount of data?

RavinderSingh13 Over a year ago

@mark, sorry, but this is not clear; please give a clearer example.

mark Over a year ago

OK, suppose the input is a:1|b:2|c:3|a:<har:[email protected]>; tag=vy6r5BpcvQ| where the key is 'a' for both the first and the 4th entry.

RavinderSingh13 Over a year ago

@mark, OK, so what would the condition be for printing these values? Kindly confirm.

mark Over a year ago

Yes, it worked. I made a small change, gsub(/^[^<]*|; .*/,"",$1) sub(/[^:]*:/,"",$4), and it worked with that as well. Thank you, sir.


anubhava · Answer · edited 2020-12-11 17:39:15Z

4

You may also try this AWK script:

cat file

VAL:1|b:2|c:3|VAL:<har:[email protected]>; tag=vy6r5BpcvQ|VAl:1234|name:mnp|VAL:91987654321

awk -F '[|;]' '{
   s=""
   for (i=1; i<=NF; ++i)
      if ($i ~ /^VAL:/) {
         gsub(/^[^:]+:|[<>]*/, "", $i)
         s = (s == "" ? "" : s "," ) $i
      }
   print s
}' file

1,har:[email protected],91987654321
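
The sample line contains three fields that start with VAL:, and the loop above collects every one of them. If only the first two are wanted, one option is to stop after a fixed number of matches. A minimal sketch, assuming a hypothetical wrapper first_vals whose argument is the maximum number of matches to keep, and a placeholder address in the sample line:

```shell
# Hypothetical wrapper: $1 is the maximum number of VAL: fields to keep per line.
first_vals() {
  awk -F '[|;]' -v max="$1" '{
    s = ""; n = 0
    for (i = 1; i <= NF && n < max; ++i)
      if ($i ~ /^VAL:/) {
        gsub(/^[^:]+:|[<>]*/, "", $i)   # strip "VAL:" and any angle brackets
        s = (s == "" ? "" : s ",") $i
        n++
      }
    print s
  }'
}

printf '%s\n' 'VAL:1|b:2|c:3|VAL:<har:user@example.com>; tag=xyz|VAl:1234|name:mnp|VAL:91987654321' |
  first_vals 2
# prints: 1,har:user@example.com
```

Note that the match is case-sensitive, so the VAl:1234 field is skipped either way.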

edited Dec 11, 2020 at 17:39

Peter Mortensen


answered Dec 11, 2020 at 7:49

anubhava


2 Comments

mark Over a year ago

Hi sir, thanks for the solution. Here you access the a and d fields and do the filtering. I have one doubt: what if, instead of two different fields a and d, the input changes to a:1|b:2|c:3|a:<har:[email protected]>; tag=vy6r5BpcvQ? How would I filter it then, as there may be multiple fields whose names start the same way, like d?

Ed Morton Over a year ago

@mark wrt I have one doubt ... - what you have is a "question", not a "doubt". A doubt means you don't believe something you've been told, a question just means you'd like information about something. It's a common mistake in the English spoken in India, see can-doubt-sometimes-mean-question. No big deal of course, just thought you'd like to know.

David C. Rankin · Answer · 2020-12-11 08:35:52Z

3

You can do the same thing with sed rather easily using Extended Regex, two capture groups and two back-references, e.g.

sed -E 's/^[^:]*:(\w+)[^<]*[<]([^>]+).*$/\1,\2/'

Explanation

's/find/replace/' is the standard substitution, where the find is:
^[^:]*: from the beginning skip through the first ':', then
(\w+) capture one or more word characters ([a-zA-Z0-9_]), then
[^<]*[<] consume zero or more characters not a '<', then the '<', then
([^>]+) capture everything not a '>', and
.*$ discard all remaining chars in line, then the replace is
\1,\2 reinsert the captured groups separated by a comma.

Example Use/Output

$ echo 'a:1|b:2|c:3|d:<har:[email protected]>; tag=vy6r5BpcvQ|' | 
sed -E 's/^[^:]*:(\w+)[^<]*[<]([^>]+).*$/\1,\2/'
1,har:[email protected]
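
One portability note on the expression above: \w is a GNU extension rather than part of POSIX extended regular expressions. A minimal sketch of the same substitution using the POSIX class [[:alnum:]_] instead, wrapped in a hypothetical function first_and_bracketed and run on a line with a placeholder address:

```shell
# Hypothetical wrapper; same substitution with [[:alnum:]_] in place of \w.
first_and_bracketed() {
  sed -E 's/^[^:]*:([[:alnum:]_]+)[^<]*<([^>]+).*$/\1,\2/'
}

printf '%s\n' 'a:1|b:2|c:3|d:<har:user@example.com>; tag=xyz|' | first_and_bracketed
# prints: 1,har:user@example.com
```

Like the original, this relies on the wanted text being the first <...> group on the line; lines without a '<' pass through unchanged.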

edited Dec 11, 2020 at 8:35

answered Dec 11, 2020 at 8:14

David C. Rankin

