How do I extract some columns from a text file using Bash scripting?

Question

I have a file like the one below, and I want to reformat it into the result shown further down. The file is huge. I need your help, please. Thanks in advance.

File:

X  1846      211L  33599.00  51209.001    1  4321
X  1846      211L  33599.00  51209.001  433  8641
X  1846      211L  33599.00  51209.001  865 12961
X  1846      211L  33599.00  51209.001 1297 17281
X  1846      221L  34031.00  52085.002    1  2211
X  1846      221L  34031.00  52085.002  222  4421
X  1846      221L  34031.00  52085.002  443  6631
X  1846      221L  34031.00  52085.002  664  8841
.....
X  1846     1001L  34023.00  51785.002    1  3711
X  1846     1001L  34023.00  51785.002  372  7421
X  1846     1001L  34023.00  51785.002  743 11131
X  1846     1001L  34023.00  51785.002 1114 14841
....
X  1846     9991L  34027.00  51353.002    1  4321
X  1846     9991L  34027.00  51353.002  433  8641
X  1846     9991L  34027.00  51353.002  865 12961
X  1846     9991L  34027.00  51353.002 1297 17281
X  1846    10001L  33593.00  51053.001    1  4321
X  1846    10001L  33593.00  51053.001  433  8641
X  1846    10001L  33593.00  51053.001  865 12961
X  1846    10001L  33593.00  51053.001 1297 17281
.....

Result:

1846   21 33599 51209
1846   22 34031 52085
...
1846  100 34023 51785
...
1846  999 34027 51353
1846 1000 33593 51053
...

For example, I want 21 to be preceded by two spaces, so that the numbers are right-aligned and its 1 is in the same column as the last 0 of 1000:

  21
1000

Thanks in advance.

You actually don't have to paste the whole table if it's that long. The first three or four lines would be sufficient. See also the Stack Overflow help page about the Markdown rules it uses.
– firegurafiku, Commented Apr 10, 2016 at 11:21

Lars Fischer · Accepted Answer · 2016-04-10 12:56:07Z

2

You can use awk (I tested it with GNU awk, at least) like this:

awk -v FS="([ .]+|1L)" '{printf("%4d %4d %5d %5d\n", $2, $3, $5, $7)}'  your_file

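It uses 1L and . as additional field delimiters, so the decimal parts (such as 00) become separate fields that we simply ignore. The 1L suffix is also dropped by this delimiter setting. Because the 1L match is immediately followed by a separate whitespace match, there is an empty field $4 between them, which is why the two coordinates are $5 and $7.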
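
To see why the command picks $2, $3, $5 and $7, the following minimal sketch prints the first seven fields that this field separator produces for the first sample line of the question. The scratch file and the `sample` and `fields` variable names are only illustrative and not part of the original answer.

```shell
# Hypothetical scratch file holding the first sample line from the question.
sample=$(mktemp)
cat > "$sample" <<'EOF'
X  1846      211L  33599.00  51209.001    1  4321
EOF

# Print the first seven fields produced by the answer's field separator.
fields=$(awk -v FS="([ .]+|1L)" '{ for (i = 1; i <= 7; i++) printf("$%d=[%s]\n", i, $i) }' "$sample")
printf '%s\n' "$fields"

# Clean up the scratch file.
rm -f "$sample"
```

In the output, $3 is 21, $4 is empty, $5 is 33599, $6 is 00 and $7 is 51209.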

If you don't like the awk solution, you could also use cut and select the desired character positions with the -c option:

cut -c 4-8,12-15,19-24,29-34 your_file
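
Both commands above print one output line per input line, while the desired result in the question shows each record only once. One possible way to collapse the repeats, assuming the repeated lines of a record are always adjacent as in the sample, is to pipe the output through uniq. The scratch file and variable names below are illustrative only.

```shell
# Hypothetical scratch file with two records, each repeated, as in the question.
sample=$(mktemp)
cat > "$sample" <<'EOF'
X  1846      211L  33599.00  51209.001    1  4321
X  1846      211L  33599.00  51209.001  433  8641
X  1846    10001L  33593.00  51053.001    1  4321
X  1846    10001L  33593.00  51053.001  433  8641
EOF

# Select the character columns, then drop adjacent duplicate lines.
# Assumption: identical output lines are consecutive; uniq does not
# remove duplicates that are separated by other lines.
result=$(cut -c 4-8,12-15,19-24,29-34 "$sample" | uniq)
printf '%s\n' "$result"

# Clean up the scratch file.
rm -f "$sample"
```

For this sample the output is two lines, `1846   21 33599 51209` and `1846 1000 33593 51053`. The same `| uniq` can be appended to the awk command.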

edited Apr 10, 2016 at 12:56

answered Apr 10, 2016 at 11:39

Lars Fischer


1 Comment

Walter A Over a year ago

@H.R When you want to use cut, look at the longest numbers and think about which offsets you want. I would cut with -c4-15,18-24,28-34, which includes some spaces.

firegurafiku · Accepted Answer · 2016-04-10 11:31:14Z

1

Given that the input looks exactly as presented in the question, left-padded with spaces and not tabs, there is a quick and dirty colrm approach:

$ cat input.txt | colrm 35 | colrm 25 27 | colrm 16 18 | colrm 1 3
1846      21 33599  51209
1846      21 33599  51209
1846      21 33599  51209
1846      21 33599  51209
...

Note that I wouldn't recommend this if you want your data processing to be robust and reliable. You may want to consider a more appropriate tool than Bash.

answered Apr 10, 2016 at 11:31

firegurafiku


1 Comment

firegurafiku Over a year ago

@H.R: BTW, the cut approach in Lars's answer seems to be superior, as it doesn't spawn multiple processes for such a simple task.
