
I have code that works on files of this type (output of cat myfile.txt):

XSAP_SM1_100 COR-REV-SAPQ-P09 - 10/14/2013 -
SCHEDULE XSAP_SM1_100#COR-REV-SAPQ-P09 TIMEZONE Europe/Paris
ON RUNCYCLE RULE1 "FREQ=WEEKLY;BYDAY=WE"
EXCEPT RUNCYCLE CALENDAR2 FR1DOFF -1 DAYS
EXCEPT RUNCYCLE SIMPLE3 11/11/2011
AT 0530
:
XSAP_SM1_100#CORREVSAPQP09-01
AT 0640 TIMEZONE Europe/Paris
XSAP_SM1_100#CORREVSAPQP09-02
AT 0645 TIMEZONE Europe/Paris

Code is

awk 'BEGIN { RS=":"; FS="\n"}
    NR==2 {
        for(i=1;i<=NF;++i) {
            if($i !~ /^$/) {
                split($i,tmp,"#")
                i=i+1
                split($i,tmp2," ")
                printf "\"%s\",\"%s\",\"%s\"\n", tmp[1],tmp[2],tmp2[2]
            }
        }
    }'

But I have another file type. I'll be running this command on thousands of files in a for loop; I have consolidated them, and only for the type below does it not work as expected.

cat testing.txt
ODSSLT_P09 COR-ODS-SMT9-B01 - 12/29/2015 -
SCHEDULE ODSSLT_P09#COR-ODS-SMT9-B01 TIMEZONE UTC
ON RUNCYCLE RULE1 "FREQ=DAILY;"
AT 0505
PRIORITY 11
:
ODSSLT_P09#CORODSSMT9001-01
UNTIL 2355 TIMEZONE Asia/Shanghai
EVERY 0100
ODSSLT_P09#CORODSSMT9001-02
AT 2355
EVERY 0100
ODSSLT_P09#CORODSSMT9001-03
ODSSLT_P09#CORODSSMT9001-04
UNTIL 2355 TIMEZONE Asia/Shanghai
EVERY 0100

EOF

Expected output for this file:

"ODSSLT_P09","CORODSSMT9001-01",""
"ODSSLT_P09","CORODSSMT9001-02","2355"
"ODSSLT_P09","CORODSSMT9001-03",""
"ODSSLT_P09","CORODSSMT9001-04",""

The code I actually run is

| grep -v -i -w -E "CONFIRMED|DEADLINE|DAY|DAYS|EVERY|NEEDS|OPENS|PRIORITY|PROMPT|UNTIL|AWSBIA291I|END|FOLLOWS" |
awk 'BEGIN { RS=":"; FS="\n"}
NR==2 {for(i=1;i<=NF;++i) {
  if($i !~ /^$/) {
    split($i,tmp,"#")
    i=i+1
    split($i,tmp2," ")
    printf "\"%s\",\"%s\",\"%s\"\n", tmp[1],tmp[2],tmp2[2]
}}}'

The output is only:

"ODSSLT_P09","CORODSSMT9001-01",""
"AT 2355","",""
"ODSSLT_P09","CORODSSMT9001-04",""
  • ksh is great, but you should be able to do all of this in one (maybe two) awk programs. Why complicate it? Or, if you really want to use ksh AND you can rely on your data after the : being in two-line pairs, you can insert another step after your first awk that "folds" each 2 lines onto one line. Then you can just use awk '{print $1, $3}' | awk -F# '{print $0}' | awk '{printf(...)}' (using, of course, a more rigorous printf("\"%s\"....) call to get your CSV data in order). Good luck. Commented Feb 13, 2016 at 18:22
  • Thank you, I have used this below Commented Feb 20, 2016 at 13:42
  • It would help if you highlighted the patterns that are relevant for the desired output. My guess is that column 3 of that output should hold the value after the string "UNTIL", if it is present? Don't leave us guessing; explain exactly what you need. A bit of explanation of the domain would help too: are these flight or freight schedules? Commented Feb 20, 2016 at 15:29

2 Answers

1

The best solution would be a small awk program that does everything (awk loops through the input itself, so no while loop is needed).
Since you tagged the question ksh and not bash or linux, I do not trust your version of awk.
First, try joining all lines and splitting them again, except where the next line starts with AT. I assume no line contains the string EOL, so I will use EOL as the join marker.

   sed 's/$/EOL/' myfile.txt |
   tr -d "\n" |
   sed -e 's/EOLAT/ AT/g' -e 's/EOL/\n/g'

Your sed version may not understand \n in the replacement; in that case, replace it with a real newline.
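A minimal sketch of the "real newline" variant, in case it helps: a backslash followed by a literal line break in the replacement is the portable way to insert a newline with sed. The variable names input and joined are only for illustration, and the sample data is the job section of myfile.txt from the question.

```shell
# Illustrative input: the four job lines from myfile.txt in the question.
input='XSAP_SM1_100#CORREVSAPQP09-01
AT 0640 TIMEZONE Europe/Paris
XSAP_SM1_100#CORREVSAPQP09-02
AT 0645 TIMEZONE Europe/Paris'

# Mark every line end with EOL, join everything into one line,
# glue each AT line onto the line before it, then split at the
# remaining EOL markers using backslash + literal newline.
joined=$(printf '%s\n' "$input" |
   sed 's/$/EOL/' |
   tr -d '\n' |
   sed -e 's/EOLAT/ AT/g' -e 's/EOL/\
/g')

printf '%s\n' "$joined"
```

Each job should end up on one line together with its AT clause. This still assumes that the string EOL never occurs in the data.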
Since I know what I want to do with the sed output, I will filter the input before sed and adjust the sed commands.

foundcolon="0";
grep -E "^:$|XSAP|AT" myfile.txt |
   sed 's/$/EOL/' |
   tr -d "\n" |
   sed -e 's/EOLAT//g' -e 's/EOL/\n/g' -e 's/#/ /g' |
   while read -r xsap corr numm rest_of_line; do
      if [ "${foundcolon}" = "0"  ]; then
         if [ "${xsap}" = ":" ]; then
            foundcolon="1"
         fi
         continue
      fi
      printf '"%s","%s","%s"\n' "${xsap}" "${corr}" "${numm}";
   done

Using another sed feature, an address range as in sed -e '/address1/,/address2/ d', makes it even simpler:

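grep -E "^:$|XSAP|AT" myfile.txt |
   sed 's/$/EOL/' |
   tr -d "\n" |
   sed -e 's/EOLAT//g' -e 's/EOL/\n/g' |
   # The range delete needs its own sed: the joined text above is a single
   # input line, so '1,/^:$/ d' in the same sed would delete everything.
   sed -e '1,/^:$/ d' -e 's/#/ /g' |
   while read -r xsap corr numm rest_of_line; do
      printf '"%s","%s","%s"\n' "${xsap}" "${corr}" "${numm}";
   done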

1

Here's a more or less pure awk solution, which produces literally the requested output for the given input file. It suffers from having no knowledge of the problem domain.

awk '
/^:/ { start=1; next }
! start {next}
$1 == "AT" {
  split(last,a,/#/)
  printf "\"%s\",\"%s\",\"%s\"\n", a[1], a[2], $2
  last=""
  next
}
{
  last=$0
}' data

1 Comment

Thanks Henk, it worked, but it prints only the jobs that have an AT value. I also need the rows with empty values for the fields separated by #. Please let me know if you can find a way to do this. Thanks in advance!
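One possible way to also get the rows without an AT value, as a sketch rather than a definitive answer: remember the current job line, record an AT time if one follows, and print the row when the next job line (or the end of the input) is reached. It assumes that every job line after the : contains a # and that no other line there does. The helper name emit and the variables data and csv are made up for this example, and a POSIX awk with user-defined functions is assumed.

```shell
# Sample data copied from testing.txt in the question.
data='ODSSLT_P09 COR-ODS-SMT9-B01 - 12/29/2015 -
SCHEDULE ODSSLT_P09#COR-ODS-SMT9-B01 TIMEZONE UTC
ON RUNCYCLE RULE1 "FREQ=DAILY;"
AT 0505
PRIORITY 11
:
ODSSLT_P09#CORODSSMT9001-01
UNTIL 2355 TIMEZONE Asia/Shanghai
EVERY 0100
ODSSLT_P09#CORODSSMT9001-02
AT 2355
EVERY 0100
ODSSLT_P09#CORODSSMT9001-03
ODSSLT_P09#CORODSSMT9001-04
UNTIL 2355 TIMEZONE Asia/Shanghai
EVERY 0100'

csv=$(printf '%s\n' "$data" | awk '
# Print the pending job (if any) as one CSV row, then reset the state.
function emit() {
    if (job != "") {
        split(job, a, "#")
        printf "\"%s\",\"%s\",\"%s\"\n", a[1], a[2], at
    }
    job = ""; at = ""
}
/^:/       { start = 1; next }        # job section begins after the colon line
!start     { next }                   # skip the schedule header
/#/        { emit(); job = $0; next } # new job: flush the previous one first
$1 == "AT" { at = $2 }                # AT time belongs to the pending job
END        { emit() }                 # flush the last job
')

printf '%s\n' "$csv"
```

For the sample above this should give the four rows listed as expected output in the question, with an empty third column for jobs that have no AT line. Lines such as UNTIL, EVERY and the trailing EOF are simply ignored, so the grep -v filter is not needed in front of it.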
