0

I am looking to parse the return of the command ll *.dat > output.txt. This produced a text file:

-rw-rw-rw-    1 root     root           16 Apr  8 12:15 01_entries.dat
-rw-rw-rw-    1 root     root           32 Apr 23 15:21 02_entries.dat

I then want to parse these lines. I see 9 tokens per line:

permissions, num of links value, user, group, size, month, day, time and filename.

input1="output.txt"
while
        IFS='   ' read -r f1 f2 f3 f4 f5 f6 f7 f8 f9
do
        echo "$f1, $f2, $f3, $f4, $f5, $f6, $f7, $f8, $f9"
done < "$input1"

But this locks into an eternal loop - any suggestions on how I can get each line to parse correctly?

2
  • input1="output.txt" does not make sense, it should be input1=$(cat output.txt). But in fact you can just do while ... do ... done < output.txt Commented May 26, 2014 at 10:45
  • Your code is working for me but I don't think setting IFS=' ' is necessary. Commented May 26, 2014 at 13:05
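For reference, here is a minimal sketch of the loop with the redirection discussed in the comments above. The function name parse_listing is made up for illustration. The redirection expects a file name, so a variable holding "output.txt" is fine as it stands. If the redirection is missing, read waits on the terminal for input, which can look like an endless loop; whether that is what happened in the question is only a guess.

```shell
# parse_listing is a hypothetical helper name; it reads "ls -l" style
# lines from the file named in $1 and prints the fields comma-separated.
parse_listing() {
    # With the default IFS, read splits on runs of spaces and tabs.
    # The last variable (f9) receives the rest of the line, so a file
    # name containing spaces stays in one piece.
    while read -r f1 f2 f3 f4 f5 f6 f7 f8 f9
    do
        echo "$f1, $f2, $f3, $f4, $f5, $f6, $f7, $f8, $f9"
    done < "$1"
}

# Only run the demo if the listing from the question exists.
if [ -f output.txt ]; then
    parse_listing output.txt
fi
```

This still inherits the weaknesses of parsing ls -l output (locale-dependent dates, unusual file names), so treat it as a fix for the loop only.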

2 Answers 2

3

Instead of parsing the output of ls -l (which is pretty error-prone, locale-dependent and all), why not find a command that gives you the desired output directly?

(Yes, I know. Sorry for the bad pun.)

# -maxdepth 1: Maximum recursion depth (1 == current directory only).
# -type f: Files only (no directories).
# -name ...: Name pattern of files to list. Escape the * (as \*) so the shell does not expand it.
# -printf ...: Format of the listing, printf() style:
#    %M: File permissions, symbolic view. Use %m for octal.
#    %n: Number of hard links to file.
#    %u: User name if present, or numeric ID. Use %U for numeric ID only.
#    %g: Group name if present, or numeric ID. Use %G for numeric ID only.
#    %s: File size in bytes. Use %b for 512-byte blocks, %k for 1k blocks.
#    %Tm: Last modification timestamp, month (01..12).
#    %Td: Last modification timestamp, day (01..31).
#    %TH: Last modification timestamp, hour (00..23). (Yanks use %TI and %Tp.)
#    %TM: Last modification timestamp, minute (00..59).
#    %p: Filename, including the starting point (e.g. ./01_entries.dat). Use %f for the base name only.
find . -maxdepth 1 -type f -name \*.dat -printf "%M,%n,%u,%g,%s,%Tm,%Td,%TH:%TM,%p\n"

Check man find for more options to -printf, and find in general. (Be especially careful with the date / time fields, since you aren't querying the year here...)
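To tie this back to the loop in the question, here is a rough sketch of reading the comma-separated output field by field. It assumes GNU find (BSD/macOS find has no -printf). The function name list_dat and the variable names are made up for illustration. It also adds %TY for the year and uses %f so the name comes without the leading ./ prefix.

```shell
# list_dat is a hypothetical helper: list *.dat files in directory $1
# (default: current directory) as comma-separated records.
# Requires GNU find for -printf.
list_dat() {
    # %TY is the four-digit year of the last modification;
    # %f is the file name without leading directories.
    find "${1:-.}" -maxdepth 1 -type f -name '*.dat' \
        -printf '%M,%n,%u,%g,%s,%TY-%Tm-%Td,%TH:%TM,%f\n'
}

# Split each record on commas. The file name is read last, so any
# commas inside it end up in the final variable instead of shifting fields.
list_dat . | while IFS=, read -r perms links user group size date time name
do
    echo "$name: $size bytes, modified $date $time"
done
```

File names containing newlines would still break the line-by-line read; if that matters, a different record separator is needed.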


Gratuitous time data quiz question: What is the range of values for %TS (modification time, seconds)?

  • 00..59? Wrong.
  • 00..60? Wrong.
  • 00..61? Correct.

If you don't know why, better read up on "leap seconds" if you want to write stable time-stamp parsing code. ;-)


1 Comment

This is an idea I had not thought about! Thanks for this!
2

Sounds like you're trying to produce a CSV file of some sort.

Probably better to do that the proper way (e.g. Python has a great library for this) for a variety of reasons: quoting, commas in filenames, spaces in filenames, significant whitespace etc.

However a quick-and-dirty solution in bash might be:

ls -l *.dat | awk '{print $1", "$2", "$3", "$4", "$5", "$6", "$7", "$8", "$9}'

You could of course iterate over the parameters in awk properly, but then, err, it's not as quick and dirty...
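For what it's worth, iterating over the fields in awk might look something like the sketch below. The function name fields_to_csv is made up for illustration, and the sample line is the one from the question. A file name containing spaces is still split into several fields, so this is no more robust than the one-liner.

```shell
# fields_to_csv is a hypothetical helper: join every whitespace-separated
# field of each input line with ", ".
fields_to_csv() {
    awk '{
        # Print each field followed by ", ", or by a newline after the last one.
        for (i = 1; i <= NF; i++)
            printf "%s%s", $i, (i < NF ? ", " : "\n")
    }'
}

# Demo with a sample line from the question.
printf '%s\n' '-rw-rw-rw-    1 root     root           16 Apr  8 12:15 01_entries.dat' | fields_to_csv
```

In practice you would pipe the listing into it, e.g. ls -l *.dat | fields_to_csv.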

Note also how this didn't use the alias ll in your script - it can mean different things to different users. Better to use the binary directly (ls -l).
