I had a jumbled-up file with content as follows:

13,13,GAME_FINISH,
1,1,GAME_START,
1,1,GROUP_FINISH,
17,17,WAGER,200.00
2,2,GAME_FINISH,
2,2,GAME_START,
22,22,GAME_WIN,290.00
2,2,GROUP_FINISH,
32,32,WAGER,200.00
3,3,GAME_FINISH,
3,3,GAME_START,
.... more lines

I sorted it, and the file content is currently in the following format:

1,1,GAME_FINISH,
1,1,GAME_START,
1,1,GROUP_FINISH,
1,1,WAGER,200.00
2,2,GAME_FINISH,
2,2,GAME_START,
2,2,GAME_WIN,290.00
2,2,GROUP_FINISH,
2,2,WAGER,200.00
3,3,GAME_FINISH,
3,3,GAME_START,
3,3,GROUP_FINISH,
3,3,WAGER,200.00
... more lines

But how can I sort it further to obtain the following format? The 3rd and 4th lines may not always exist.

1,1,WAGER,200.00
1,1,GAME_START,
1,1,GAME_WIN,500.00
1,1,BONUS_WIN_1,1100.00
1,1,GAME_FINISH,
1,1,GROUP_FINISH,
2,2, more lines...

For the initial sort, I used

sort -t, -g -k2 nameofunsortedfile.csv >> sortedfile.csv

Added Information:

I want to sort it in this order: wager, game start, game win, bonus win, game finish, group finish. My current sorted output is not in this order. Game win and bonus win may not always be present.

The order I am expecting is not alphabetical, but it is not random either. Every number always has a wager, game_start, game_finish, group_finish sequence; the game win and bonus win records are optional. I am looking for a way to, for example, take 1,1 and sort it into the expected sequence mentioned above, then move on to 2,2 and do the same, and so on.

2 Answers

The most straightforward way to do this with standard UNIX utilities is probably to add an additional field to each line, which encodes the type of record in a way that sorts into the order you want.

declare -A mapping=( ["WAGER"]=1 ["GAME_START"]=2 ["GAME_WIN"]=3 ["BONUS_WIN"]=4 ["GAME_FINISH"]=5 ["GROUP_FINISH"]=6 )
cut -d, -f3 filename.txt | while read; do echo ${mapping["$REPLY"]}; done | paste -d, - filename.txt | sort | sort -s -t, -n -k 2,3 | cut -d, -f 2-

The declare statement declares a mapping that allows you to look up the ordering of each record type. The specific values (1, 2, etc.) don't matter as long as they sort into the order you want; you could use letters or words if you prefer.
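
As a hedged alternative to the associative array (which needs bash 4 or later), the same lookup can be sketched as a plain case statement. The function name rank is hypothetical, and two things here are assumptions rather than part of the answer above: numbered variants such as BONUS_WIN_1 (as in the question's expected output) are ranked as bonus wins, and unknown record types sort last.

```shell
# Hypothetical helper: print the sort rank for one record type.
# Assumption: numbered variants such as BONUS_WIN_1 rank as bonus wins.
rank() {
    case "$1" in
        WAGER)        echo 1 ;;
        GAME_START)   echo 2 ;;
        GAME_WIN)     echo 3 ;;
        BONUS_WIN*)   echo 4 ;;
        GAME_FINISH)  echo 5 ;;
        GROUP_FINISH) echo 6 ;;
        *)            echo 9 ;;   # assumption: unknown types sort last
    esac
}

rank BONUS_WIN_1    # prints 4
rank GROUP_FINISH   # prints 6
```

As with the mapping, only the relative order of the ranks matters, not the specific numbers.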

Then the next line consists of the following commands:

  • cut -d, -f3 filename.txt extracts the thing you want to sort by (WAGER or whatever)
  • while read; do echo ${mapping["$REPLY"]}; done takes each value (WAGER etc.) and replaces it with its corresponding sortable value from the associative array mapping
  • paste -d, - filename.txt sticks those values back on to the start of each line from filename.txt
  • sort | sort -s -t, -n -k 2,3 works in two passes: the first sort orders the lines by the added field 1, and the second is a stable (-s) numeric sort on fields 2 to 3, which groups the lines by id while keeping the field 1 order within each id. sort also accepts several -k options, so a single call such as sort -t, -k2,2n -k3,3n -k1,1n should give the same ordering.
  • cut -d, -f 2- strips off the added field, leaving you with your original records, but in sorted order
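
The steps above (add a rank field, sort, strip the field) can also be sketched with a single sort call that uses repeated -k keys. This is a minimal sketch, not the answer's exact pipeline: the function name sort_records is hypothetical, awk stands in for the bash associative array, and the record type is assumed to be in field 3.

```shell
# Hypothetical wrapper: sort records by id, then by record-type rank.
# Assumes the record type is field 3 of each comma-separated line.
# Usage: sort_records filename.txt > sortedfile.csv
sort_records() {
    awk -F, -v OFS=, '
        BEGIN {
            rank["WAGER"] = 1;       rank["GAME_START"] = 2
            rank["GAME_WIN"] = 3;    rank["BONUS_WIN"] = 4
            rank["GAME_FINISH"] = 5; rank["GROUP_FINISH"] = 6
        }
        # Prepend the rank as a new first field; types missing from the
        # table get rank 0 and so sort first within their id.
        { print rank[$3] + 0, $0 }
    ' "$@" |
    sort -t, -k2,2n -k3,3n -k1,1n |   # id fields first, then the added rank
    cut -d, -f2-                      # drop the added field again
}
```

Because awk prints the original line unchanged after the rank, trailing commas on lines such as 1,1,GAME_FINISH, are preserved.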

2 Comments

Works great! Just made a slight adjustment to text from GROUP_WIN to GAME_FINISH. Thank you for the detailed explanation.
@kar Ah, yes, that was a typo on my part. Sorry!

Perl to the rescue:

#!/usr/bin/perl
use warnings;
use strict;
use feature qw{ say };

my $i = 1;
my %order = map { $_ => $i++ }
            qw( WAGER GAME_START GAME_WIN BONUS_WIN GAME_FINISH GROUP_FINISH );

chomp( my @lines = <> );
say join ',', @$_ for sort {
    $a->[0] <=> $b->[0]
    || $order{ $a->[2] } <=> $order{ $b->[2] }
} map [ split /,/, $_, -1 ], @lines;   # limit -1 keeps the trailing empty field

The sort block tells Perl to first sort by the first column, and if the values are the same, use the "order" corresponding to the third one.

1 Comment

Limited Perl knowledge. Understand what you mean in the explanation. Will test it out. Thanks.
