I am stuck on a small sorting step. I have a huge file with more than 300K entries, and it has to be sorted on a specific column containing alphanumeric identifiers such as:

Rpl12-8
Lrsam1-1
Rpl12-9
Lrsam1-2
Rpl12-10
Lrsam1-5
Rpl12-11
Lrsam1-101
Lrsam2-1
Act-1
Act-100
Act-101
Act-11

The problem is the variable width of the identifiers, which means I cannot specify the second key by character position (e.g. sort -k 1.8n). The sort should be first on the leading letters, then on the number next to them, and then on the number after the "-". Can I sort on the part after the "-" by using it as a field delimiter, so that the width of the string no longer matters?

The desired output would be:

Act-1
Act-11
Act-100
Act-101
Lrsam1-1
Lrsam1-2
Lrsam1-5
Lrsam1-101
Lrsam2-1
Rpl12-8
Rpl12-9
Rpl12-10
Rpl12-11

1 Answer

With the above data in input.txt:

sort -t- -k1,1 -k2n input.txt

You can change the field delimiter to - with -t, then sort on the first field only (as a string) with -k1,1, and finally the 2nd field (as a number) with -k2n.
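As a quick check, here is a minimal sketch that runs the same command on a few of the identifiers from the question, piped in rather than read from input.txt. Setting LC_ALL=C is my own addition, not part of the original command: it makes the string comparison on the first field byte-wise, so the result does not depend on the locale.

```shell
# A subset of the sample identifiers from the question, in unsorted order.
# -t-    : use "-" as the field separator
# -k1,1  : first key is field 1 only, compared as a string
# -k2n   : second key starts at field 2, compared numerically
# LC_ALL=C (an assumption, see above) forces byte-wise string comparison.
sorted=$(printf '%s\n' Rpl12-8 Lrsam1-1 Rpl12-10 Lrsam1-101 Lrsam2-1 Act-1 Act-100 Act-11 |
  LC_ALL=C sort -t- -k1,1 -k2n)

printf '%s\n' "$sorted"
```

Note that the first field is compared purely as a string, so a name such as Lrsam10 would sort before Lrsam2. That does not affect the sample data, but it may matter if the number before the "-" varies in width.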

2 Comments

Thanks for the answer, it works great. What if the file is tab-separated and the example shown above is the 5th column? Can I specify two delimiters in the same command, so that the file is treated as tab-separated but "-" is used as the delimiter within the 5th column, or would I have to use it in conjunction with awk etc.?
My answer only works when that column is by itself. If it is a single column in a tab-separated file, then you will have to use another tool as well. For example, you could replace the hyphen with a tab, sort on both resulting columns, and then replace that tab with a hyphen again. How many columns are there in total? What data do the other columns contain?
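The replace, sort, replace-back idea could be sketched roughly as follows. Everything about the data here is an assumption: the rows are hypothetical six-column tab-separated lines with the identifier in column 5, each identifier contains exactly one "-", and LC_ALL=C is again my own choice for a locale-independent string comparison.

```shell
tab=$(printf '\t')

# Hypothetical tab-separated rows; the identifier sits in column 5.
data=$(printf 'a\tb\tc\td\t%s\tz\n' Rpl12-10 Act-11 Lrsam1-2 Act-1 Rpl12-8)

# Step 1 (awk):  replace the first "-" in column 5 with a tab, so the name
#                and the number become fields 5 and 6.
# Step 2 (sort): sort on field 5 as a string, then on field 6 numerically.
# Step 3 (awk):  print the fields again, joining field 6 back onto field 5
#                with "-" instead of a tab.
sorted=$(printf '%s\n' "$data" |
  awk 'BEGIN { FS = OFS = "\t" } { sub(/-/, "\t", $5); print }' |
  LC_ALL=C sort -t "$tab" -k5,5 -k6,6n |
  awk 'BEGIN { FS = OFS = "\t" } {
         line = $1
         for (i = 2; i <= NF; i++) line = line (i == 6 ? "-" : OFS) $i
         print line
       }')

printf '%s\n' "$sorted"
```

Restricting the substitution to column 5 with awk, rather than replacing every hyphen in the line, keeps hyphens in the other columns intact. If the identifier is in a different column, the field numbers in all three steps need to be adjusted together.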
