bash script Rows to Columns

Question

How can I change a file which is like this:

into a file with the row names as column headers, each printed only once, and their corresponding values in columns? Like this:

 A   B     A   D    E
25   26   14   39  42
74   36   81   96  17
23   14   74   87  17

My columns are repeated every 29 rows and some columns, like A, have the same name.

Have you made any attempt?

– anubhava, Commented Apr 9, 2016 at 18:07
The last two days is the only thing that I'm doing :)

– jimakos17, Commented Apr 9, 2016 at 18:37
Don't tell us you made an attempt; show us the attempt.

– chepner, Commented Apr 9, 2016 at 19:25

hek2mgl · Accepted Answer · answered 2016-04-09 18:48, last edited 2016-04-09 20:24:12Z by glenn jackman

5

You can use the following awk script to transform the file:

transform.awk:

{
    # On the first record this loop runs twice: once for the
    # headers and once for the first line of data.
    # On all subsequent records it prints only the data,
    # because h == 1.
    for(;h<=1;h++){
        for(i=1+h;i<=NF;i+=2){
            printf "%s ", $i
        }
        printf "\n"
    }
    h=1
}

Then execute it like this:

awk -f transform.awk RS='' file

Output:

A B A D E 
25 26 14 39 42 
74 36 81 96 17 
23 14 74 87 17

To get proper aligned columns you can pipe to column -t:

awk -f transform.awk RS='' file | column -t

Output:

A   B   A   D   E
25  26  14  39  42
74  36  81  96  17
23  14  74  87  17
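
If column is not available, one alternative is to pad the fields inside awk itself. The following is a minimal sketch, not part of the original answer. It assumes that no name or value is wider than three characters (the cell width of 4 is an arbitrary choice) and it uses a small inline two-block sample instead of the real file. The variable name aligned is only for illustration.

```shell
#!/bin/bash
# Same transformation as transform.awk, but each field is left-aligned in a
# fixed-width cell with printf instead of being piped through column -t.
# Assumption: a cell width of 4 is enough for every name and value.
aligned=$(printf 'A 25\nB 26\n\nA 74\nB 36\n' | awk '
BEGIN { RS = "" }
{
    # First record only: print the names (odd fields) before the values.
    if (NR == 1) {
        for (i = 1; i <= NF; i += 2) printf "%-4s", $i
        printf "\n"
    }
    # Every record: print the values (even fields).
    for (i = 2; i <= NF; i += 2) printf "%-4s", $i
    printf "\n"
}')
printf '%s\n' "$aligned"
```

Each output line keeps trailing padding after the last cell, which is usually harmless for display.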

The key here is the variable RS (record separator). Setting RS to an empty string puts awk into paragraph mode: records are separated by one or more blank lines, roughly as if RS were \n\n+. The first record, for example, is the whole first block of name/value lines, up to the first blank line.

By default awk splits fields on runs of whitespace, which in paragraph mode includes newlines. This gives the following fields for the first record:

A 25 B 26 A 14 D 39 E 42

The algorithm shown above transforms these fields into the desired output.
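
To see the effect of RS='' in isolation, the sketch below writes a small sample file and prints every record together with its field count. The sample is reconstructed from the expected output in the question and assumes one "name value" pair per line with blank lines between blocks; the real file may use different spacing. The names sample and records are made up for this illustration.

```shell
#!/bin/bash
# Sample input reconstructed from the expected output in the question
# (assumption: one "name value" pair per line, blocks separated by a blank line).
sample=$(mktemp)
trap 'rm -f "$sample"' EXIT
cat > "$sample" <<'EOF'
A 25
B 26
A 14
D 39
E 42

A 74
B 36
A 81
D 96
E 17

A 23
B 14
A 74
D 87
E 17
EOF

# With RS = "" each blank-line-separated block is one record, and the
# newlines inside a block act as field separators.
# "$1 = $1" rebuilds the record so the fields are joined by single spaces.
records=$(awk 'BEGIN { RS = "" } { $1 = $1; print "record " NR ": NF=" NF ": " $0 }' "$sample")
printf '%s\n' "$records"
```

With this sample, every record has ten fields: the odd-numbered ones are the names and the even-numbered ones are the values, which is what the loop in transform.awk relies on.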

edited Apr 9, 2016 at 20:24 by glenn jackman

answered Apr 9, 2016 at 18:48 by hek2mgl


6 Comments

jimakos17 Over a year ago

Thank you hek2mgl for your answer but I'm getting: column: Invalid or incomplete multibyte or wide character.

hek2mgl Over a year ago

@EdMorton You are right, input like \n\n\n+ works as well with RS=''!

jimakos17 Over a year ago

I'm still getting the same error. Not sure if I use unicode locale. I suppose your answer is correct but I cannot verify it now on this laptop. I will try it tomorrow in another machine and I'll accept your answer. Thank you very much for your time and effort @hek2mgl

hek2mgl Over a year ago

@jimakos17 You are welcome. What does echo $LANG give you?

jimakos17 Over a year ago

en_US.UTF-8 @hek2mgl I connected to another machine and just reproduced it. Thank you very much again!!!


karakfa · Answer · 2016-04-09 19:23:18Z

2

An alternative to the awk solution, using a pipeline of other standard Unix tools:

$ sed '/^$/d' file    | 
  pr -3ts' '          | 
  tr '\t' ' '         | 
  tr -s ' '           | 
  cut -d' ' -f1,2,4,6 | 
  tr ' ' '\n'         | 
  pr -5ts' '          |
  column -t



A   B   A   D   E
25  26  14  39  42
74  36  81  96  17
23  14  74  87  17

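The first magic number, 3, is the number of repeated sections (that is, the number of data rows, not counting the header). The second magic number, 5, is the number of items in each section (the number of columns). The cut field list 1,2,4,6 also depends on the number of sections: it keeps the name once and then the value from each of the three sections.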
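
Both numbers can be derived from the file instead of being hard-coded. The following is a sketch only, assuming that blocks are separated by blank lines and that every block has the same number of lines. The variable names sections and items and the inline sample (reconstructed from the expected output) are made up for this illustration.

```shell
#!/bin/bash
# Inline sample with 3 sections of 5 items each (an assumption based on the
# expected output shown in the question).
sample=$(mktemp)
trap 'rm -f "$sample"' EXIT
printf '%s\n' 'A 25' 'B 26' 'A 14' 'D 39' 'E 42' '' \
              'A 74' 'B 36' 'A 81' 'D 96' 'E 17' '' \
              'A 23' 'B 14' 'A 74' 'D 87' 'E 17' > "$sample"

# First magic number: count of blank-line-separated sections.
sections=$(awk 'BEGIN { RS = "" } END { print NR }' "$sample")

# Second magic number: count of lines in the first section
# (FS = "\n" makes every line of the paragraph one field).
items=$(awk 'BEGIN { RS = ""; FS = "\n" } NR == 1 { print NF }' "$sample")

echo "sections=$sections items=$items"
```

These values could then stand in for the literals 3 and 5 in the pipeline; the cut field list would have to be generated from the section count as well.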

answered Apr 9, 2016 at 19:23

karakfa


glenn jackman · Answer · 2016-04-09 20:39:49Z

1

For fun, some opaque, perl-ish ruby:

ruby -00 -lane '
    headers, values = $F.each_with_index.partition {|(v,i)| i.even?}
    puts headers.collect(&:first).join(" ") if $. == 1
    puts values.collect(&:first).join(" ")
' file

answered Apr 9, 2016 at 20:39

glenn jackman


1 Comment

JJoao Over a year ago

Cool! nice solution.

David C. Rankin · Answer · 2016-04-09 19:29:05Z

And just to round out the mix, you can do it in a fairly flexible manner with a simple bash script (limited to reading 2-column files formatted like your input file). It reads the data from the filename given as the first argument, or from stdin by default.

The script simply reads column 1 and column 2 into separate indexed arrays (a1 and a2) until a blank line is encountered. The first time through it prints the heading row and sets the heading flag h so that the headings are not printed again; it then prints the data in a2.

When the end of the file is reached, it simply prints the final row of data.

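#!/bin/bash

# Read from the file named in $1, or from stdin if no argument is given.
fname="${1:-/dev/stdin}"

declare -i h=0      # set to 1 once the heading row has been printed
declare -a a1=()    # column 1 (names) of the current block
declare -a a2=()    # column 2 (values) of the current block

# Print the heading row once, then the values of the current block.
flush() {
    (( ${#a2[@]} )) || return 0     # nothing collected, e.g. repeated blank lines
    (( h )) || { printf " %2s" "${a1[@]}"; echo; h=1; }
    printf " %2s" "${a2[@]}"; echo
    a1=(); a2=()
}

# The "|| [ -n ... ]" part keeps a final line that lacks a trailing newline.
while read -r name value || [ -n "$name" ]; do
    if [ -n "$name" ]; then
        a1+=( "$name" )
        a2+=( "$value" )
    else
        flush
    fi
done < "$fname"

flush   # last block, when the file does not end with a blank line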

Use/Output

$ bash r2c.sh dat/r2c.txt
  A  B  A  D  E
 25 26 14 39 42
 74 36 81 96 17
 23 14 74 87 17

JJoao · Answer · 2016-04-09 23:48:03Z

0

Or a little more regex-oriented:

perl -0pE  'say s/\s*\d+\h*\n|\n.*/ /sgr;  s/(^|\n)\w\s*/ /g' file

answered Apr 9, 2016 at 23:48

JJoao

