Unix command "uniq" & "sort"

Question

As we know,

uniq [options] [file1 [file2]]

It removes adjacent duplicate lines from a sorted file1. The -c option prints each line once, prefixed with the number of times it occurred. So suppose we have the following result:

     34 Operating System
    254 Data Structure
      5 Crypo
     21 C++
   1435 C Language
    589 Java 1.6

If we sort the above data using "sort -1knr", the result is as follows:

   1435 C Language
    589 Java 1.6
    254 Data Structure
     34 Operating System
     21 C++
      5 Crypo

Can anyone help me output only the book names in this order (without the numbers)?

I think it will not work because there are several spaces in front of the number; how would you identify the field? Using "cut -d ' ' -f 2"? It will return nothing.
– eleven, Commented Oct 1, 2012 at 14:46
For example, "cut -c 9-" will skip the number, but we must know the exact number of characters in the count column.
– eleven, Commented Oct 1, 2012 at 15:34
You mean -k1nr, not -1knr, right? Also, thanks for the useful command!
– gatoatigrado, Commented Apr 9, 2013 at 21:29

Barmar · Accepted Answer · 2012-10-01 14:45:58Z

2

uniq -c filename | sort -k 1nr | awk '{$1="";print}'
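A minimal end-to-end sketch of this pipeline, using made-up sample titles fed in through `printf` (an assumption for illustration, not the asker's data). Because `uniq` only collapses adjacent duplicates, the sketch sorts the input first. It also uses awk's `sub()` instead of assigning to `$1`, since clearing `$1` leaves a leading space on every output line.

```shell
# Hypothetical sample data: one book title per line, unsorted.
# sort groups duplicates so that uniq -c can count them;
# sort -k1,1nr orders by the count column, highest first;
# sub() strips the leading blanks, the count, and the blanks after it.
titles=$(printf '%s\n' 'Java 1.6' 'C Language' 'Java 1.6' 'C++' 'C Language' 'C Language' |
  sort | uniq -c | sort -k1,1nr |
  awk '{ sub(/^ *[0-9]+ +/, ""); print }')

printf '%s\n' "$titles"
```

With this sample the titles come out most frequent first. Titles with equal counts may appear in either order unless a secondary sort key is added.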

answered Oct 1, 2012 at 14:45


3 Comments

eleven Over a year ago

What about not using the "awk" command, and only using "uniq", "sort", "tr", "wc", "head", "tail"?

Barmar Over a year ago

You already explained why cut is no good in your comment to Michael Krelin. You could use cut's -c option, but I wouldn't like to depend on the exact number of characters in the count column.

Michael Krelin - hacker Over a year ago

cut isn't no good, but it's the simplest way. I find this solution fine and tool appropriate. Of course you can do something with while and read, but really, awk is exactly the right tool for the task.
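To illustrate the concern about `cut -c` raised in these comments, here is a small sketch. It assumes the count is right-aligned in 7 columns followed by one space, which matches the listing in the question; that width is an assumption taken from the listing and is not guaranteed across `uniq` implementations.

```shell
# Lines formatted like the question's listing: count in 7 columns, then one space.
padded=$(printf '%7d %s\n' 1435 'C Language' 589 'Java 1.6' 5 'Crypo')

# Works only while every count fits in 7 columns.
by_cut=$(printf '%s\n' "$padded" | cut -c 9-)
printf '%s\n' "$by_cut"

# An 8-digit count pushes the title one column to the right,
# so the same offset now leaves a stray leading space.
shifted=$(printf '%7d %s\n' 12345678 'C Language' | cut -c 9-)
printf '[%s]\n' "$shifted"
```

The fixed offset is fine for small counts under this assumption, but a pattern-based approach (awk or sed) does not depend on the column width at all.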

Nelson Benítez León · Accepted Answer · 2014-01-10 12:26:00Z

2

You can also use sed for that, as follows:

uniq -c filename | sort -k1nr | sed 's/[0-9]\+ \(.\+\)/\1/g'

Test:

echo "34 Data Structure" | sed 's/[0-9]\+ \(.\+\)/\1/g'
Data Structure

This can also be done with a simplified regex (courtesy William Pursell):

echo "34 Data Structure" | sed 's/[0-9]* *//'
Data Structure
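Note that the test lines above have no leading blanks, whereas `uniq -c` output is normally right-aligned with leading spaces. On such input the unanchored `[0-9]* *` matches an empty digit run plus the leading spaces and stops before the count. A hedged variant that anchors at the start of the line and consumes the blanks first (the sample line is made up to resemble the question's listing):

```shell
# A line shaped like `uniq -c` output: padded count, one space, then the title.
line='     34 Data Structure'

# Unanchored pattern: only the leading spaces are removed, the count stays.
unanchored=$(printf '%s\n' "$line" | sed 's/[0-9]* *//')

# Anchored pattern: leading spaces, the count, and the following spaces are removed.
anchored=$(printf '%s\n' "$line" | sed 's/^ *[0-9]* *//')

printf '%s\n' "$unanchored" "$anchored"
```

Because the anchored form has no `g` flag and matches only at the start of the line, digits inside a title are left alone.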

edited Jan 10, 2014 at 12:26

answered Oct 1, 2012 at 15:23


3 Comments

William Pursell Over a year ago

This could be greatly simplified: sed 's/[0-9]* *//g'

Nelson Benítez León Over a year ago

Indeed! The one you posted (with *) did not work in my tests, but it worked with +. I'm adding that to my answer, thanks :-)

William Pursell Over a year ago

You cannot have the g flag in the simplified version. That would munge titles like "20000 leagues under the sea". (My error in including it; muscle memory dies hard.)

mivk · Accepted Answer · 2018-08-22 12:44:24Z

0

Why do you use uniq -c to print the number of occurrences, which you then want to remove with some cut/awk/sed dance?

Instead, you could just use

sort -u $file1 $file2 /path/to/more_files_to_glob*

Or do some systems come with a version of sort which doesn't support -u?
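For comparison, a small sketch with made-up titles: `sort -u` does avoid the count column, but it returns the unique lines in collation order, not ordered by how often each one occurs, which is the order the question asks for. `LC_ALL=C` is set here only to make the collation order predictable.

```shell
# Hypothetical input: 'Java 1.6' occurs most often, then 'C++'.
input=$(printf '%s\n' 'Java 1.6' 'C++' 'Java 1.6' 'Data Structure' 'Java 1.6' 'C++')

# Unique lines in sorted order; the frequency information is lost.
unique_sorted=$(printf '%s\n' "$input" | LC_ALL=C sort -u)

# Frequency order, for contrast (count column stripped with sed).
by_frequency=$(printf '%s\n' "$input" | sort | uniq -c | sort -k1,1nr | sed 's/^ *[0-9]* *//')

printf '%s\n---\n%s\n' "$unique_sorted" "$by_frequency"
```

So `sort -u` is a good fit when only the set of distinct lines matters, while the `uniq -c` pipeline is still needed to rank lines by count.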

answered Aug 22, 2018 at 12:44

