I have a list of files, with full path, that I need to sort in a bash shell.

The list looks like this:

/total/path/software/version1.2.3.4/filename.10.cfg -- infomation grepped
/full/path/software/version1.2.3.4/filename.1.cfg -- infomation grepped
/long/path/software/version1.2.3.4/filename.2.cfg -- infomation grepped
/full/path/software/version1.2.3.4/filename.12.cfg -- infomation grepped
/long/path/software/version1.2.3.4/filename.1.cfg -- infomation grepped
/full/path/software/version1.2.3.4/filename.3.cfg -- infomation grepped
/long/path/software/version1.2.3.4/filename.18.cfg -- infomation grepped
/full/path/software/version1.2.3.4/filename.20.cfg -- infomation grepped
/real/path/software/version1.2.3.4/filename.4.cfg -- infomation grepped
/total/path/software/version1.2.3.4/filename.5.cfg -- infomation grepped

I need to have the list first sorted by the path, and then by the filename number.

I've tried:

sort -t'.' -k 1,1 -k 2,5n filename.txt

But it only ever sorts by the path. If I do:

sort -t'.' -k5n filename.txt

It works fine. How can I get the filenames in numeric order, after sorting by path?

Thanks

  • Do ALL your paths have the same number of components in them? Commented May 23, 2013 at 23:34
  • Yes. Every line follows the same pattern. Commented May 24, 2013 at 15:20

3 Answers


You need to sort on everything up to the filename first, and then use the filename number as a tie-breaker:

sort -t'.' -k1,4 -k5n,5n filename.txt
/full/path/software/version1.2.3.4/filename.1.cfg -- infomation grepped
/full/path/software/version1.2.3.4/filename.3.cfg -- infomation grepped
/full/path/software/version1.2.3.4/filename.12.cfg -- infomation grepped
/full/path/software/version1.2.3.4/filename.20.cfg -- infomation grepped
/long/path/software/version1.2.3.4/filename.1.cfg -- infomation grepped
/long/path/software/version1.2.3.4/filename.2.cfg -- infomation grepped
/long/path/software/version1.2.3.4/filename.18.cfg -- infomation grepped
/real/path/software/version1.2.3.4/filename.4.cfg -- infomation grepped
/total/path/software/version1.2.3.4/filename.5.cfg -- infomation grepped
/total/path/software/version1.2.3.4/filename.10.cfg -- infomation grepped
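
To check which field holds the file number, it can help to print how one line splits on periods. A minimal sketch, assuming awk is available (awk is not part of the command above; it is used here only to display the fields):

```shell
line='/full/path/software/version1.2.3.4/filename.12.cfg -- infomation grepped'

# Split on '.' the same way sort -t'.' does and print each field with its index.
fields=$(printf '%s\n' "$line" |
    awk -F'.' '{ for (i = 1; i <= NF; i++) printf "%d=[%s]\n", i, $i }')
printf '%s\n' "$fields"
# 1=[/full/path/software/version1]
# 2=[2]
# 3=[3]
# 4=[4/filename]
# 5=[12]
# 6=[cfg -- infomation grepped]
```

Field 5 is the file number, and fields 1 through 4 cover everything up to the file name, which is what -k1,4 -k5n,5n relies on. It also suggests why -k 2,5n did not help: that key starts at field 2, so the numeric comparison most likely reads the same leading 2.3 on every line and every line ties.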

Is this what you are looking for?

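 # Group the lines by the first two characters of the path (fu, lo, re, to),
 # then sort each group numerically on the file number.
 for ch in $(sort testfile.txt | cut -c2-3 | uniq)
 do
     sed -n "/^\/$ch/p" testfile.txt | sort -t'.' -k5n
 done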

Result:

/full/path/software/version1.2.3.4/filename.1.cfg -- infomation grepped
/full/path/software/version1.2.3.4/filename.3.cfg -- infomation grepped
/full/path/software/version1.2.3.4/filename.12.cfg -- infomation grepped
/full/path/software/version1.2.3.4/filename.20.cfg -- infomation grepped
/long/path/software/version1.2.3.4/filename.1.cfg -- infomation grepped
/long/path/software/version1.2.3.4/filename.2.cfg -- infomation grepped
/long/path/software/version1.2.3.4/filename.18.cfg -- infomation grepped
/real/path/software/version1.2.3.4/filename.4.cfg -- infomation grepped
/total/path/software/version1.2.3.4/filename.5.cfg -- infomation grepped
/total/path/software/version1.2.3.4/filename.10.cfg -- infomation grepped

The approach is the same as yours; I just added sed.


I would create a sort key, and then sort on that sort key, then remove the sort key:

Let's see...

while IFS= read -r line
do
    dirname=${line%/*}   # Directory name: strip the last slash and what follows
    number=$(echo "$line" | sed 's/.*\.\([0-9]*\)\.cfg.*/\1/')  # File number
    printf "%-60.60s %04d | %s\n" "$dirname" "$number" "$line"
done < filetext.txt | sort | sed 's/^[^|]*| //'   # Sort on the key, then strip it

This reads each line of filetext.txt into the while read loop through input redirection.

The dirname uses the ${parameter%word} feature in Bash. This takes the value of ${parameter} and removes the smallest amount from the right side that matches the pattern word. Thus, ${line%/*} takes $line and removes the last forward slash and all characters after it.
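
A minimal sketch of that expansion on one sample line. The variable rest and the ## form are only there for contrast and are not used in the loop above:

```shell
line='/full/path/software/version1.2.3.4/filename.12.cfg -- infomation grepped'

# % removes the shortest suffix that matches the pattern /*
dirname=${line%/*}
# ## removes the longest prefix that matches the pattern */
rest=${line##*/}

printf '%s\n' "$dirname"   # /full/path/software/version1.2.3.4
printf '%s\n' "$rest"      # filename.12.cfg -- infomation grepped
```

This assumes the text after the file name contains no forward slash. If the grepped information could contain one, the directory name would be cut at the wrong place.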

The number was a bit trickier. I noticed that you had something like .44.cfg at the end of the file name. That meant if I could find that particular pattern, I could find the file number. My sed command looks for a period, followed by zero or more digits, followed by .cfg, and marks the digits as a group. I then replace the entire line with that first group, which gives me the number.
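
As a quick check, the same sed expression can be run on a single sample line. This is only a sketch of the extraction step on its own:

```shell
line='/full/path/software/version1.2.3.4/filename.12.cfg -- infomation grepped'

# Capture the digits between the last period before ".cfg" and ".cfg" itself,
# then replace the whole line with that capture group.
number=$(printf '%s\n' "$line" | sed 's/.*\.\([0-9]*\)\.cfg.*/\1/')

printf '%s\n' "$number"   # 12
```

The periods inside the version number do not get in the way, because only the digits that sit directly in front of .cfg can satisfy the pattern.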

Next, I print out the directory and the number using printf. I pad the directory name with spaces to sixty characters (that could be increased if needed) and follow it with a zero-padded four-digit number. This creates a sort key that looks like this:

/full/path/software/version1.2.3.4                           0001
/full/path/software/version1.2.3.4                           0003
/full/path/software/version1.2.3.4                           0012
/full/path/software/version1.2.3.4                           0020
/long/path/software/version1.2.3.4                           0001
/long/path/software/version1.2.3.4                           0002
/long/path/software/version1.2.3.4                           0018
/real/path/software/version1.2.3.4                           0004
/total/path/software/version1.2.3.4                          0005
/total/path/software/version1.2.3.4                          0010
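
The point of the fixed-width key is that a plain text sort then agrees with numeric order. A small sketch with two keys (the names key1, key2 and sorted are just for illustration):

```shell
# Build two keys for the same directory with file numbers 3 and 12.
key1=$(printf '%-60.60s %04d' '/full/path/software/version1.2.3.4' 3)
key2=$(printf '%-60.60s %04d' '/full/path/software/version1.2.3.4' 12)

printf '[%s]\n[%s]\n' "$key1" "$key2"

# Feed them in the "wrong" order; a plain sort puts 0003 before 0012.
sorted=$(printf '%s\n%s\n' "$key2" "$key1" | LC_ALL=C sort)
printf '%s\n' "$sorted"
```

Two caveats with this format: %-60.60s also truncates directory names longer than sixty characters, and printf reads a number with a leading zero (such as 08) as octal, so this assumes the file numbers carry no leading zeros.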

I append the line to this sort key, and then do my sort. After that, I remove the sort key from the line. The results:

/full/path/software/version1.2.3.4/filename.1.cfg -- infomation grepped
/full/path/software/version1.2.3.4/filename.3.cfg -- infomation grepped
/full/path/software/version1.2.3.4/filename.12.cfg -- infomation grepped
/full/path/software/version1.2.3.4/filename.20.cfg -- infomation grepped
/long/path/software/version1.2.3.4/filename.1.cfg -- infomation grepped
/long/path/software/version1.2.3.4/filename.2.cfg -- infomation grepped
/long/path/software/version1.2.3.4/filename.18.cfg -- infomation grepped
/real/path/software/version1.2.3.4/filename.4.cfg -- infomation grepped
/total/path/software/version1.2.3.4/filename.5.cfg -- infomation grepped
/total/path/software/version1.2.3.4/filename.10.cfg -- infomation grepped

Note I'm not depending upon a particular format for the file name as others have in their answers. What if you had a line like this?

/total/path/software/version1.2/filename.10.cfg -- infomation grepped

That line has only three periods, so the file number is no longer the fifth period-separated field. Anything that sorts by splitting the line into fields on the periods will fail. The above will still work.
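
A minimal sketch of that difference, assuming two lines with a shorter version number. The temporary file and the names by_fields and by_key are made up for this example:

```shell
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
/total/path/software/version1.2/filename.10.cfg -- infomation grepped
/total/path/software/version1.2/filename.5.cfg -- infomation grepped
EOF

# Field-based sort: field 5 does not exist here, so the lines are compared
# as text and filename.10 lands before filename.5.
by_fields=$(LC_ALL=C sort -t'.' -k1,4 -k5n,5n "$tmp")

# Sort-key approach: build the key, sort on it, then strip it again.
by_key=$(
    while IFS= read -r line; do
        dirname=${line%/*}
        number=$(printf '%s\n' "$line" | sed 's/.*\.\([0-9]*\)\.cfg.*/\1/')
        printf '%-60.60s %04d | %s\n' "$dirname" "$number" "$line"
    done < "$tmp" | LC_ALL=C sort | sed 's/^[^|]*| //'
)
rm -f "$tmp"

printf 'by fields:\n%s\n\nby key:\n%s\n' "$by_fields" "$by_key"
```

With these two lines, the field-based command lists filename.10 first, while the sort key lists filename.5 first.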

2 Comments

That's a lot of extra steps, but I appreciate the answer. Luckily, the version number format never changes, even if that means you have a version like 3.0.0.0.
I knew that sort -t'.' -k1,4 -k5n,5n filename.txt would work, but since it was already suggested, I simply decided to take a different route. The sort key is a good technique when the formatting might be more iffy. Even here, the period is a mere side effect.
