2

As we know, the synopsis of uniq is:

uniq [options] [file1 [file2]]

It removes duplicate adjacent lines from the sorted file1. The -c option prints each line once, prefixed with the number of times it occurred. So suppose we have the following result:

     34 Operating System
    254 Data Structure
      5 Crypo
     21 C++
   1435 C Language
    589 Java 1.6

If we sort the above data using "sort -1knr", the result is as below:

   1435 C Language
    589 Java 1.6
    254 Data Structure
     34 Operating System
     21 C++
      5 Crypo

Can anyone help me output only the book names in this order (without the numbers)?

5
  • 3
    cut is the magic word (one of them). Commented Oct 1, 2012 at 14:42
  • I think it will not work because there are several spaces in front of the number, so how could you identify the field? Using "cut -d ' ' -f 2"? It will return nothing. Commented Oct 1, 2012 at 14:46
  • For example "cut -c 9-", it will ignore the number, but we must know the exact number of characters in the count column Commented Oct 1, 2012 at 15:34
  • Yes, we need to know the width to use cut. Commented Oct 1, 2012 at 16:43
  • you mean -k1nr, not -1knr, right? also, thanks for the useful command! Commented Apr 9, 2013 at 21:29

3 Answers

2
uniq -c filename | sort -k 1nr | awk '{$1=""; print}'

3 Comments

What about not using the "awk" command, and only using "uniq", "sort", "tr", "wc", "head", "tail"?
You already explained why cut is no good in your comment to Michael Krelin. You could use the -c option, but I wouldn't like to depend on the exact number of characters in the count column.
cut isn't no good, but it's the simplest way. I find this solution fine and tool appropriate. Of course you can do something with while and read, but really, awk is exactly the right tool for the task.
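
For the awk-free variant asked about in the comments above, here is a minimal sketch that sidesteps the column-width problem by squeezing the padding with tr before calling cut. The function name strip_counts is hypothetical, and the approach rests on two assumptions: the count is padded with at least one leading space (so the first field is empty), and no title contains runs of several spaces, since tr -s would collapse them.

```shell
# Hypothetical helper: drop the count column that `uniq -c` prepends,
# without knowing how wide that column is.
strip_counts() {
    # Squeeze every run of spaces to a single space; the line then looks like
    # " 1435 C Language": field 1 is empty, field 2 is the count, and the
    # title starts at field 3.
    tr -s ' ' | cut -d ' ' -f 3-
}

# Sample data standing in for "filename" (assumed, not from the question).
printf '%s\n' 'C Language' 'C Language' 'C++' 'Java 1.6' 'Java 1.6' 'Java 1.6' |
    sort | uniq -c | sort -k1,1nr | strip_counts
```

This still relies on cut, which is not in the list of allowed tools, so treat it as a compromise rather than a strict answer to that comment.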
2

You can also use sed for that, as follows:

uniq -c filename | sort -k1nr | sed 's/[0-9]\+ \(.\+\)/\1/g'

Test:

echo "34 Data Structure" | sed 's/[0-9]\+ \(.\+\)/\1/g'
Data Structure

This can also be done with a simplified regex (courtesy William Pursell):

echo "34 Data Structure" | sed 's/[0-9]* *//'
Data Structure
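
Note that the tests above feed sed a line without leading spaces, whereas uniq -c usually pads the count on the left. A sketch of a variant that also consumes that padding, anchored at the start of the line and without the g flag, could look like this (the function name strip_count_sed is made up for illustration, and the sample lines are assumed):

```shell
# Hypothetical helper: remove optional leading spaces, the count,
# and the single space that follows it. Anchoring with ^ and omitting
# the g flag means digits inside a title are left alone.
strip_count_sed() {
    sed 's/^ *[0-9]* //'
}

# Sample lines shaped like `uniq -c` output (assumed data).
printf '%s\n' '   1435 C Language' '      3 20000 leagues under the sea' |
    strip_count_sed
```

It uses only basic regular expression syntax (no \+), so it should not depend on GNU sed extensions.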

3 Comments

This could be greatly simplified: sed 's/[0-9]* *//g'
Indeed!, although the one you posted (with *) did not work in my tests, it worked with + , I'm adding that to my answer, thanks :-)
You cannot have the g flag in the simplified version. That would munge titles like "20000 leagues under the sea". (My error in including it; muscle memory dies hard.)
0

Why do you use uniq -c to print the number of occurrences, which you then want to remove with some cut/awk/sed dance?

Instead, you could just use

sort -u "$file1" "$file2" /path/to/more_files_to_glob*

Or do some systems come with a version of sort which doesn't support -u?
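
One thing to keep in mind when comparing the two approaches: sort -u returns the unique lines in collation order, while the question wants them ordered by how often they occur. The following sketch, using made-up sample data and hypothetical function names, shows the difference:

```shell
# Assumed sample input: "Java 1.6" three times, "C++" twice, "Crypo" once.
titles() {
    printf '%s\n' 'Java 1.6' 'C++' 'Java 1.6' 'C++' 'Java 1.6' 'Crypo'
}

# Unique titles in byte order (LC_ALL=C makes the order predictable).
by_name() {
    titles | LC_ALL=C sort -u
}

# Unique titles, most frequent first, with the count column stripped.
by_count() {
    titles | sort | uniq -c | sort -k1,1nr | sed 's/^ *[0-9]* //'
}

by_name
by_count
```

So sort -u is enough if only the set of titles matters, but the counts are still needed as a sort key when the order by frequency matters.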
