I have a CSV file like this:

fish,4
cat,1
elephant,1
tree,2
dog,8
car,10

awk -F',' '{print length($1),$0}' file.csv | sort -k1nr | cut -d' ' -f 2- will sort the file by word length, for all words appearing in the first column:

elephant,1
fish,4
tree,2
cat,1
dog,8
car,10

sort -t, -k+2 -n -r file.csv will sort the file from greatest to least according to the number appearing in the second column:

car,10
dog,8
fish,4
tree,2
elephant,1
cat,1

How can I combine these two commands so that the CSV file is first sorted by the length of the word in the first column (longest first), and rows whose words have equal length are then sorted by the number in the second column, from greatest to least? The resulting output would look like this:

elephant,1
fish,4
tree,2
car,10
dog,8
cat,1

How can these two sorting methods be used together?

3 Answers

5

Try this line:

awk -F, '{print length($1)","$0}' file | sort -t, -rn -k1,1 -k3,3 | sed 's/^[^,]*,//'

This gives:

elephant,1
fish,4
tree,2
car,10
dog,8
cat,1

The idea: first prepend the length of column 1 to each row, then sort the awk output on two keys (the added length and the original number), and finally strip the added length column (the first column) to get the result.
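The three stages can also be run one at a time to see what each contributes. This is a minimal sketch, assuming the sample rows from the question. The helper names decorate, undecorate and rows are illustrative and not part of the answer, and each sort key is limited to a single field with -k1,1 and -k3,3:

```shell
# Illustrative helper names (not from the answer).
decorate()   { awk -F, '{ print length($1) "," $0 }'; }   # prepend length of column 1
undecorate() { cut -d, -f2-; }                            # drop the prepended field

# Sample rows from the question.
rows() { printf '%s\n' fish,4 cat,1 elephant,1 tree,2 dog,8 car,10; }

# Stage 1: each row gains a leading length field, e.g. "4,fish,4".
rows | decorate

# Stage 2: sort on field 1 (length), then field 3 (number), both numeric and descending.
rows | decorate | sort -t, -k1,1nr -k3,3nr

# Stage 3: strip the helper field to recover the original two columns.
rows | decorate | sort -t, -k1,1nr -k3,3nr | undecorate
```

Using cut for the last stage removes exactly one leading field, however many digits the length has.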

1

If you are using GNU awk (gawk), you can use its asort function to do the sorting, so no other utility has to be called. You can try something like this:

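gawk -F, 'function cmp(i1,v1,i2,v2) {split(v1,a1); split(v2,a2)
  # longer word first; on equal length, larger number first
  l1=length(a1[1]); l2=length(a2[1])
  return l1 > l2 ? -1 : l1 < l2 ? 1 : a1[2] > a2[2] ? -1 : a1[2] < a2[2]
}
{a[n++]=$0}
# asort renumbers the array from 1 to n; walk it by index, since for(i in a) has no guaranteed order
END{n=asort(a,a,"cmp"); for(i=1;i<=n;i++) print a[i]}' infile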

Output:

elephant,1
fish,4
tree,2
car,10
dog,8
cat,1

This script reads all the lines first, then sorts the array a with the function cmp. The only trick used is that a comparison such as a < b evaluates to the usual 1 or 0 for true or false, so the last branch can return it directly.

A slightly shorter version in Perl:

perl -F, -ane 'push @a,[@F]; 
  END{for $i(sort {length $b->[0]<=>length $a->[0] or $b->[1]<=>$a->[1]} @a) {printf "%s,%d\n", @$i}
}' infile

This is not 100% correct, as $F[1] still contains the trailing \n, but printf with %d handles it properly.

0

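Reverse the order of the sorts (sort by the number first, then by word length), and make the second sort stable with -s so that rows of equal length keep the order produced by the first sort.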
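As a sketch of how that could look, assuming a sort that supports -s (GNU and BSD sort both do) and the two-column CSV from the question on standard input. The function name sort_stable_two_pass is made up for illustration:

```shell
# Illustrative function name (not from the answer); reads CSV rows on stdin.
sort_stable_two_pass() {
  # Pass 1: order by the secondary key, the number in column 2, descending.
  sort -t, -k2,2nr |
    # Prefix each row with the length of column 1, separated by a space.
    awk -F, '{ print length($1), $0 }' |
    # Pass 2: order by the primary key (length, descending); -s keeps
    # rows of equal length in the order pass 1 left them in.
    sort -s -k1,1nr |
    # Remove the length prefix again.
    cut -d' ' -f2-
}

printf '%s\n' fish,4 cat,1 elephant,1 tree,2 dog,8 car,10 | sort_stable_two_pass
```

Without -s, sort falls back to comparing whole lines when the keys are equal, which would discard the ordering from the first pass.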
