using awk substr

Question

I have a file with lines like this:

1       17      A       G       R:560:500:60:10.71%:1.6329E-19  Pass:1.0:276:0:57:0:1E0 15      17      0       0       R:24:20:4:16.67%:5.461E-2 R:22:20:2:9.09%:2.4419E-1 R:27:24:3:11.11%:1.1792E-1 R:26:23:3:11.54%:1.1765E-1 A:16:16:0:0%:1E0 A:23:23:0:0%:1E0 A:11:10:1:9.09%:5E-1
1       36      C       T       Y:560:499:61:10.89%:7.7026E-20  Pass:1.0:275:0:58:0:1E0 15      17      0       0       Y:24:20:4:16.67%:5.461E-2 Y:22:20:2:9.09%:2.4419E-1 Y:27:24:3:11.11%:1.1792E-1 Y:26:23:3:11.54%:1.1765E-1 C:16:16:0:0%:1E0 C:23:23:0:0%:1E0 C:11:10:1:9.09%:5E-1

Until now I have been using the following awk one-liner to extract the first character of each field from $11 onwards.

awk '{n=11; while (n<18) {{$n = substr($n, 0, 1)} n++} print $0}'

I am looking for an easy way to modify it so that it extracts only the percentages from these fields (the value after the 4th colon of each field). The output would look like this:

1       17      A       G       R:560:500:60:10.71%:1.6329E-19  Pass:1.0:276:0:57:0:1E0 15      17      0       0       16.67% 9.09% 11.11% 11.54% 0% 0% 9.09%
1       36      C       T       Y:560:499:61:10.89%:7.7026E-20  Pass:1.0:275:0:58:0:1E0 15      17      0       0       16.67% 9.09% 11.11% 11.54% 0% 0% 9.09%

Cheers.

Dennis Williamson · Accepted Answer · 2012-07-03 16:52:56Z

This will print the percentage including the "%":

split($5, arr, ":"); print arr[5]

Adjust the field number in the split() statement to suit your data.
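As a rough illustration of adjusting the field number, the sketch below passes it in as an awk variable. The name `col` is hypothetical, and the input is a shortened line built from the question's sample data (the first five columns plus one of the later fields), not the full record.

```shell
# Shortened sample line (an assumption for illustration): field 5 and field 6
# are both colon-separated, with the percentage as the 5th piece.
line='1 17 A G R:560:500:60:10.71%:1.6329E-19 R:24:20:4:16.67%:5.461E-2'

# "col" is a hypothetical variable name; $col selects the field to split.
from_field5=$(printf '%s\n' "$line" | awk -v col=5 '{split($col, arr, ":"); print arr[5]}')
from_field6=$(printf '%s\n' "$line" | awk -v col=6 '{split($col, arr, ":"); print arr[5]}')

printf '%s\n' "$from_field5"
printf '%s\n' "$from_field6"
```

Only the value of `col` changes between the two calls; the index into `arr` stays at 5 because the percentage is always the fifth colon-separated piece.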

You don't need a while loop with a manually managed counter; a for loop does the same job more concisely. Here is a complete, working script that combines the split() technique shown above with a for loop:

awk 'BEGIN {OFS = "\t"} {for (n = 11; n < 18; n++) {split($n, arr, ":"); $n = arr[5]}; print $0}'

Sample output:

1   17  A   G   R:560:500:60:10.71%:1.6329E-19  Pass:1.0:276:0:57:0:1E0 15  17  0   0   16.67%  9.09%   11.11%  11.54%  0%  0%  9.09%
1   36  C   T   Y:560:499:61:10.89%:7.7026E-20  Pass:1.0:275:0:58:0:1E0 15  17  0   0   16.67%  9.09%   11.11%  11.54%  0%  0%  9.09%
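The sample output is tab-separated even though the input used runs of spaces. That follows from standard awk behavior: assigning to any field makes awk rebuild $0 with OFS between the fields. A minimal sketch of just that effect, using an invented three-field line rather than the question's data:

```shell
# Invented input with irregular spacing between three fields.
input='a   b  c'

# Assigning to a field (even to itself) forces $0 to be rebuilt using OFS.
rebuilt=$(printf '%s\n' "$input" | awk 'BEGIN {OFS = "\t"} {$2 = $2; print $0}')

# Without any field assignment, $0 is printed exactly as it was read.
untouched=$(printf '%s\n' "$input" | awk 'BEGIN {OFS = "\t"} {print $0}')

printf '%s\n' "$rebuilt"
printf '%s\n' "$untouched"
```

If the original spacing has to be preserved, the fields cannot be reassigned this way; the record would need to be edited as a string instead.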

edited Jul 3, 2012 at 16:52

answered Jul 3, 2012 at 15:36

3 Comments

user1308144 Over a year ago

awk '{n=11; while (n<18) {{$n = substr($n, 0, 1)} n++} print $0}'

user1308144 Over a year ago

I am having trouble incorporating it into the awk one-liner. awk '{n=11; while (n<18) {{$n = split($n, arr,":" )} n++} print $0}' gives me just the number of elements in each array.

Dennis Williamson Over a year ago

@user1308144: Please see my edited answer. split() puts the results in the named array and returns the number of parts.
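The difference between the return value of split() and the array it fills can be seen in a small sketch. The input is a single field copied from the question's sample data; the shell variable names `count` and `pct` are illustrative only.

```shell
# One colon-separated field taken from the question's sample data.
field='R:24:20:4:16.67%:5.461E-2'

# split() returns the number of pieces, which is what ends up in $n
# if its return value is assigned directly to the field.
count=$(printf '%s\n' "$field" | awk '{print split($1, arr, ":")}')

# The pieces themselves are stored in the array; the percentage is arr[5].
pct=$(printf '%s\n' "$field" | awk '{split($1, arr, ":"); print arr[5]}')

printf 'return value: %s\n' "$count"
printf 'arr[5]:       %s\n' "$pct"
```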
