1

I have a data file with comma-separated fields:

379565,COFFEE,297678,      ,21,21,I, 6,  10.00,               ,     ,                            ,01-DISPLAY REVENUE  ,17-HOUSE ACCOUNT    ,AD#05260540         ,YES               ,N,N,20210625,
380685,COMICS,297634,      ,21,21,I, 3,  21.00,MAIN NEWS      ,     ,BATHS                       ,01-DISPLAY REVENUE  ,17-HOUSE ACCOUNT    ,AD# IS 05240526     ,YES               ,N,N,20210625,
337708,COMICS,298047, 84558,21,21,I, 6,  21.00,               ,     ,SCHOOL PAGE                 ,01-DISPLAY REVENUE  ,17-HOUSE ACCOUNT    ,                    ,CMYK              ,N,N,20210625

When column 4 only has spaces, the 8-digit ad number needs to be pulled from column 15.

This awk checks to see if column 4 is only spaces and, if so, copies column 15 to 4:

awk -F, '{ if ($4 ~ /^[[:space:]][[:space:]][[:space:]][[:space:]][[:space:]][[:space:]]/) {OFS=",";{$4=$15} print} else print}'

How can I extract just the 8-digit ad number (without the "AD#" or "AD# IS" parts) from column 15 and put into column 4?

Expected outcome:

379565,COFFEE,297678,05260540,21,21,I, 6,  10.00,               ,     ,                            ,01-DISPLAY REVENUE  ,17-HOUSE ACCOUNT    ,AD#05260540         ,YES               ,N,N,20210625,
380685,COMICS,297634,05240526,21,21,I, 3,  21.00,MAIN NEWS      ,     ,BATHS                       ,01-DISPLAY REVENUE  ,17-HOUSE ACCOUNT    ,AD# IS 05240526     ,YES               ,N,N,20210625,
337708,COMICS,298047, 84558,21,21,I, 6,  21.00,               ,     ,SCHOOL PAGE                 ,01-DISPLAY REVENUE  ,17-HOUSE ACCOUNT    ,                    ,CMYK              ,N,N,20210625
2
  • 1
    Added expected outcome Commented Jun 22, 2021 at 14:50
  • If you don't really have blank lines between ever pair of data lines then please edit your example to remove them so we have something we can copy/paste to test with. Commented Jun 22, 2021 at 15:07

1 Answer 1

2

You may use this awk:

awk 'BEGIN{FS=OFS=","} $4 ~ /^[[:blank:]]*$/ {$4 = $15; gsub(/[^[:digit:]]+/, "", $4)} 1' file

379565,COFFEE,297678,05260540,21,21,I, 6,  10.00,               ,     ,                            ,01-DISPLAY REVENUE  ,17-HOUSE ACCOUNT    ,AD#05260540         ,YES               ,N,N,20210625,
380685,COMICS,297634,05240526,21,21,I, 3,  21.00,MAIN NEWS      ,     ,BATHS                       ,01-DISPLAY REVENUE  ,17-HOUSE ACCOUNT    ,AD# IS 05240526     ,YES               ,N,N,20210625,
337708,COMICS,298047, 84558,21,21,I, 6,  21.00,               ,     ,SCHOOL PAGE                 ,01-DISPLAY REVENUE  ,17-HOUSE ACCOUNT    ,                    ,CMYK              ,N,N,20210625

An expanded form:

awk '
BEGIN {FS=OFS=","}
$4 ~ /^[[:blank:]]*$/ {
   $4 = $15
   gsub(/[^[:digit:]]+/, "", $4)
}
1' file
Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.