
I have a case where I need to split text like this sample:

A|B|C|D| 3#1#1#;1#4#4;5#1#4;|E|F|0
AE|B1|CC|DE| |EX|FF|0
AR|BE|CA|D1| 5#6#3#;4#3#4;|ED|G|

I need the output to look like this:

A|B|C|D| 3#1#1#;1#4#4;5#1#4;|E|F|0|3|1|1
A|B|C|D| 3#1#1#;1#4#4;5#1#4;|E|F|0|1|4|4
A|B|C|D| 3#1#1#;1#4#4;5#1#4;|E|F|0|5|1|4
AE|B1|CC|DE| |EX|FF|0|||
AR|BE|CA|D1| 5#6#3#;4#3#4;|ED|G||5|6|3
AR|BE|CA|D1| 5#6#3#;4#3#4;|ED|G||4|3|4

I already tried using

awk 'BEGIN{FS=OFS="|"} {split($5,a,/;/); for (i in a) {if (a[i]) $9=a[i]; else next; gsub(/#/,"|",$9); print}}' file

However, if $5 contains only a space, it won't add the columns.

2 Answers


Using any awk:

$ cat tst.awk
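{
    # With the default whitespace FS, $2 starts right after the blank that
    # precedes the #-groups. Lines without ";" fall back to one empty group.
    tgt = ( /;/ ? $2 : "###;")
    gsub(/#/,"|",tgt)            # "3#1#1#;1#4#4;" becomes "3|1|1|;1|4|4;"
    n = split(tgt,a,/\|?;/)      # split on ";" plus any "|" directly before it
    for ( i=1; i<n; i++ ) {      # a[n] is the text after the last ";", so skip it
        print $0 "|" a[i]
    }
}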

$ awk -f tst.awk file
A|B|C|D| 3#1#1#;1#4#4;5#1#4;|E|F|0|3|1|1
A|B|C|D| 3#1#1#;1#4#4;5#1#4;|E|F|0|1|4|4
A|B|C|D| 3#1#1#;1#4#4;5#1#4;|E|F|0|5|1|4
AE|B1|CC|DE| |EX|FF|0|||
AR|BE|CA|D1| 5#6#3#;4#3#4;|ED|G||5|6|3
AR|BE|CA|D1| 5#6#3#;4#3#4;|ED|G||4|3|4
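
The script above relies on awk's default whitespace splitting, so `$2` starts at the `#` groups only because they are preceded by a blank. If that blank cannot be guaranteed, a variant that splits on `|` and reads field 5 directly may be more robust. This is only a sketch under that assumption; the function name `expand_rows` and the array name `groups` are made up for illustration.

```shell
# Hypothetical helper: expand each input line into one line per ";"-terminated
# group in field 5, or append three empty columns when field 5 has no groups.
expand_rows() {
    awk 'BEGIN { FS = "|" }
    {
        tgt = $5
        gsub(/[ \t]/, "", tgt)              # drop the padding blanks
        n = split(tgt, groups, /;/)         # a trailing ";" yields an empty last piece
        printed = 0
        for (i = 1; i <= n; i++) {
            if (groups[i] == "") continue   # skip empty pieces
            sub(/#$/, "", groups[i])        # "3#1#1#" becomes "3#1#1"
            gsub(/#/, "|", groups[i])       # "3#1#1" becomes "3|1|1"
            print $0 "|" groups[i]
            printed = 1
        }
        if (!printed) print $0 "|||"        # blank field 5: pad with empty columns
    }' "$@"
}

printf '%s\n' 'AE|B1|CC|DE| |EX|FF|0' 'AR|BE|CA|D1| 5#6#3#;4#3#4;|ED|G|' | expand_rows
# prints:
# AE|B1|CC|DE| |EX|FF|0|||
# AR|BE|CA|D1| 5#6#3#;4#3#4;|ED|G||5|6|3
# AR|BE|CA|D1| 5#6#3#;4#3#4;|ED|G||4|3|4
```

Because no field is assigned, `$0` is never rebuilt, so the original line is printed unchanged in front of the new columns.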

1 Comment

defaulting with "###;" saves a lot of keystrokes, nice!

1st solution: With your shown samples, please try the following awk code.

awk '
match($0,/(([0-9]+#)+);[^|]*/){
  num=split(substr($0,RSTART,RLENGTH),arr,";")
  for(i=1;i<num;i++){
    sub(/#$/,"",arr[i])
    gsub(/#/,"|",arr[i])
    print $0"|"arr[i]
  }
  next
}
{
  print $0 "|||"
}
'   Input_file
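
To see what the `match()` call in the 1st solution captures, it can help to print `RSTART`, `RLENGTH` and the matched substring for a sample line. The helper name `show_match` below is hypothetical and exists only for this illustration; the regex is copied from the solution.

```shell
# Hypothetical helper: print where the regex matches and what it covers.
show_match() {
    printf '%s\n' "$1" |
    awk 'match($0, /(([0-9]+#)+);[^|]*/) {
        print RSTART, RLENGTH, substr($0, RSTART, RLENGTH)
        next
    }
    { print "no match" }'
}

show_match 'A|B|C|D| 3#1#1#;1#4#4;5#1#4;|E|F|0'   # prints: 10 19 3#1#1#;1#4#4;5#1#4;
show_match 'AE|B1|CC|DE| |EX|FF|0'                 # prints: no match
```

The matched text ends with `;`, so `split(..., arr, ";")` returns one extra empty element, which is why the loop in the solution runs while `i<num` rather than `i<=num`.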


2nd solution: Using a function approach in awk. With your shown samples, please try the following awk code. We can pass the function the numbers of the fields we want to inspect and extract values from; in this case I pass field numbers 2, 3 and 4 to produce the required output. If you have many such fields, I suggest using the 1st solution shown above instead.

awk -F' |;' '
function getValues(fields){
  num=split(fields,arr,",")
  for(i=1;i<=num;i++){
    if($arr[i]~/^([0-9]+#)+[0-9]*$/){
      val=$arr[i]
      sub(/#$/,"",val)
      gsub(/#/,"|",val)
      print $0"|"val
    }
  }
}
/([0-9]+#)+;/{
  getValues("2,3,4")
  next
}
{
  print $0 "|||"
}
'   Input_file

In both solutions, the output will be as follows:

A|B|C|D| 3#1#1#;1#4#4;5#1#4;|E|F|0|3|1|1
A|B|C|D| 3#1#1#;1#4#4;5#1#4;|E|F|0|1|4|4
A|B|C|D| 3#1#1#;1#4#4;5#1#4;|E|F|0|5|1|4
AE|B1|CC|DE| |EX|FF|0|||
AR|BE|CA|D1| 5#6#3#;4#3#4;|ED|G||5|6|3
AR|BE|CA|D1| 5#6#3#;4#3#4;|ED|G||4|3|4
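
For the 2nd solution, it may help to look at the fields awk produces when the separator is a space or a semicolon (`-F' |;'`). The helper name `show_fields` is hypothetical and used only for this sketch.

```shell
# Hypothetical helper: list the fields awk sees when FS is " |;".
show_fields() {
    printf '%s\n' "$1" |
    awk -F' |;' '{ for (i = 1; i <= NF; i++) print i ": " $i }'
}

show_fields 'AR|BE|CA|D1| 5#6#3#;4#3#4;|ED|G|'
# prints:
# 1: AR|BE|CA|D1|
# 2: 5#6#3#
# 3: 4#3#4
# 4: |ED|G|
```

This line has only two groups, so field 4 holds the rest of the record. It does not match the `^([0-9]+#)+[0-9]*$` test inside `getValues`, so passing "2,3,4" still prints just two rows for it.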
