I am trying to do some text file manipulation in Linux.

I have a file called names.txt that looks like this:

A1
X12
B4
Y5
C10
Z23
B8
C3
Z6

And I need it to look like this:

A01
B04
B08
C03
C10
X12
Y05
Z06
Z23

GOAL: I need to zero-pad the single-digit numbers, sort the results alphabetically, and save them to the file sorted_names.txt.

I'm thinking I need to count the number of characters per line first, and if a line has fewer than 3 characters, add a zero. Lastly, I would need to sort alphabetically.

For starters, I think I do this to count the number of characters per line:

cat names.txt | while read line
do

  count=$(echo $line | wc -c)
  echo $line $count

done

Then my thought was to loop through count:

for COUNT in $count
if [( $COUNT = "3" )];
then
    echo doZeroPadHere
fi

4 Answers

Is it important to you to do it using only built-in Bash features? Because it seems easier to use sed and sort:

<names.txt sed 's/^\([A-Z]\)\([0-9]\)$/\10\2/' | sort >sorted_names.txt

2 Comments

Hi @ruak. Thanks for your help. Would you be willing to explain your code here so I can understand what you've done?
<names.txt redirects the file into sed's standard input. The sed expression is a regex substitution that rewrites any line matching the pattern and leaves all other lines intact. ^ anchors the start of the line and $ the end, so the pattern matches only lines consisting of a single uppercase letter, captured by \([A-Z]\), followed by a single digit, captured by \([0-9]\). The replacement \10\2 emits the first capture, a literal 0, and then the second capture. Finally, sort sorts the output and >sorted_names.txt saves it.
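
To see the substitution described above in isolation, here is a minimal sketch that feeds three of the sample lines from the question straight into the same sed expression. The variable name padded is only for illustration.

```shell
# Three sample lines from the question, piped into the sed expression from the answer.
padded=$(printf '%s\n' A1 X12 B4 | sed 's/^\([A-Z]\)\([0-9]\)$/\10\2/')

# A1 and B4 match (one letter, one digit) and gain a zero;
# X12 has two digits, so it does not match and passes through unchanged.
printf '%s\n' "$padded"
```

Because the pattern is anchored with ^ and $, lines that are already two digits wide never match, so running the command twice should not add extra zeros.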
Here's a solution using only Bash and sort:

while read line
do 
    printf "%s%02d\n" "${line:0:1}" "${line:1}"
done <names.txt | sort >sorted.txt

This reads lines from names.txt and splits each one into its first character (${line:0:1}) and the rest of the line after the first character (${line:1}). It uses printf to print the first character verbatim and the rest of the line as a 0-padded number. It redirects its input from names.txt (avoiding a useless use of cat), pipes the output to sort, and redirects that into sorted.txt.
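
A slightly more defensive variant of the same loop might look like the sketch below. It is not the answer's original code: the scratch directory, the sample input (including a hypothetical already-padded line B08), and the variable names letter and digits are assumptions made for illustration. The changes are quoting the expansions, using IFS= read -r so lines are read literally, and forcing base 10 so that a value such as 08 is not treated as octal and rejected.

```shell
#!/usr/bin/env bash
# Hypothetical scratch directory so the sketch does not touch real files.
workdir=$(mktemp -d)
printf '%s\n' A1 X12 B08 C3 > "$workdir/names.txt"

while IFS= read -r line; do
    letter=${line:0:1}   # first character
    digits=${line:1}     # everything after the first character
    # 10# forces base 10, so "08" is read as eight rather than as malformed octal.
    printf '%s%02d\n' "$letter" "$((10#$digits))"
done < "$workdir/names.txt" | sort > "$workdir/sorted_names.txt"

cat "$workdir/sorted_names.txt"
```

The base-10 prefix only matters if the input can contain leading zeros; for the sample file in the question, the plain loop behaves the same way.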

4 Comments

Hi @Brian. When I try this, I get an error ":invalid number4" but the file sorted.txt is still generated correctly. Do you know what the error is and what it means?
@Sheila It happens because one of the lines contains something other than a number after the first character. ${line:1} means "all of the characters on the line after the first". The %02d part of the printf format string says "print out as a decimal integer 2 characters wide padded with 0". If the rest of the line isn't a number (for instance, your line says A0Z or D20x or something), you will get this error. If you want to find out which lines are the problem, add before the printf: [[ ${line:1} =~ [^0-9] ]] && echo $line >&2; this will print out the lines that cause problems.
Thanks for your comments. The line with the problem was "P24". This doesn't seem to have any different characteristics than the others but it is not included in the sorted.txt file. Any thoughts?
@Sheila I'm not sure; could be a stray invisible character? Maybe a space after the end of the line? Can you post the exact file somewhere?
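
One possibility consistent with that error text is a carriage return left over from Windows-style (CRLF) line endings: when the terminal prints it, the cursor jumps back to the first column, which could garble the message. A minimal sketch for checking and stripping carriage returns, assuming that is the cause (the scratch directory, the sample contents, and names_clean.txt are made-up names):

```shell
# Hypothetical scratch directory with a sample file that uses CRLF line endings.
workdir=$(mktemp -d)
printf 'P24\r\nA1\r\n' > "$workdir/names.txt"

# Count the lines that contain a carriage return (0 would mean the file is clean).
cr=$(printf '\r')
grep -c "$cr" "$workdir/names.txt"

# Write a copy with every carriage return removed.
tr -d '\r' < "$workdir/names.txt" > "$workdir/names_clean.txt"
cat "$workdir/names_clean.txt"
```

If the count is nonzero on the real file, rerunning the loop against the cleaned copy would show whether the error goes away.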
Here's one way using awk and sort:

awk '{ printf "%s%02d\n", substr($0,1,1), substr($0,2) | "sort" }' file

Results:

A01
B04
B08
C03
C10
X12
Y05
Z06
Z23

Here's a Perl way to do it:

perl -lpe 's/^([A-Z])(\d)$/${1}0$2/' names.txt

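For each line, if it consists of exactly one uppercase letter and one digit, change it to the letter, a zero, and the digit. Then print the line. Note that this only pads; to finish the job, pipe the output through sort and redirect it to sorted_names.txt.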
