Parse a Unix Command

Question

I am looking to parse the return of the command ll *.dat > output.txt. This produced a text file:

-rw-rw-rw-    1 root     root           16 Apr  8 12:15 01_entries.dat
-rw-rw-rw-    1 root     root           32 Apr 23 15:21 02_entries.dat

I then want to parse these lines, so that I see 9 tokens per line:

permissions, num of links value, user, group, size, month, day, time and filename.

input1="output.txt"
while
        IFS='   ' read -r f1 f2 f3 f4 f5 f6 f7 f8 f9
do
        echo "$f1, $f2, $f3, $f4, $f5, $f6, $f7, $f8, $f9"
done < "$input1"

But this locks into an eternal loop - any suggestions on how I can get each line to parse correctly?

input1="output.txt" does not make sense, it should be input1=$(cat output.txt). But in fact you can just do while ... do ... done < output.txt
– fedorqui, Commented May 26, 2014 at 10:45
Your code is working for me but I don't think setting IFS=' ' is necessary.
– John B, Commented May 26, 2014 at 13:05
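
For reference, here is a minimal sketch of the same read loop with the default IFS, as the second comment suggests. The function name parse_listing and the here-document sample are illustrative assumptions, not part of the original post. With nine variable names, the last one (f9) receives the rest of the line, so a filename containing spaces stays in one piece.

```shell
# parse_listing is a hypothetical helper name; it reads `ls -l` style lines on stdin.
parse_listing() {
    # The default IFS splits on runs of spaces and tabs, so no IFS override is needed.
    # f9 takes everything after the eighth field, including embedded spaces.
    while read -r f1 f2 f3 f4 f5 f6 f7 f8 f9; do
        printf '%s, %s, %s, %s, %s, %s, %s, %s, %s\n' \
            "$f1" "$f2" "$f3" "$f4" "$f5" "$f6" "$f7" "$f8" "$f9"
    done
}

# Sample input taken from the listing in the question.
parse_listing <<'EOF'
-rw-rw-rw-    1 root     root           16 Apr  8 12:15 01_entries.dat
-rw-rw-rw-    1 root     root           32 Apr 23 15:21 02_entries.dat
EOF
```

In a script, the here-document would be replaced by a redirect such as parse_listing < output.txt.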

DevSolar · Accepted Answer · 2014-05-26 12:57:42Z

Instead of parsing the output of ls -l (which is pretty error-prone, locale-dependent and all), why not find a command that gives you the desired output directly?

(Yes, I know. Sorry for the bad pun.)

# -maxdepth 1: Maximum recursion depth (1 == current directory only).
# -type f: Files only (no directories).
# -name ...: Name pattern of files to list. Escape the * (as \*) so the shell does not expand it.
# -printf ...: Format of the listing, printf() style:
#    %M: File permissions, symbolic view. Use %m for octal.
#    %n: Number of hard links to file.
#    %u: User name if present, or numeric ID. Use %U for numeric ID only.
#    %g: Group name if present, or numeric ID. Use %G for numeric ID only.
#    %s: File size in bytes. Use %b for 512-byte blocks, %k for 1k blocks.
#    %Tm: Last modification timestamp, month (01..12).
#    %Td: Last modification timestamp, day (01..31).
#    %TH: Last modification timestamp, hour (00..23). (Yanks use %TI and %Tp.)
#    %TM: Last modification timestamp, minute (00..59).
#    %p: Filename.
find . -maxdepth 1 -type f -name \*.dat -printf "%M,%n,%u,%g,%s,%Tm,%Td,%TH:%TM,%p\n"

Check man find for more options to -printf, and find in general. (Be especially careful with the date / time fields, since you aren't querying the year here...)
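
As a hedged illustration of the year caveat, the sketch below adds %TY to the format and uses %f to print the base name only. It assumes GNU find and GNU touch; the temporary directory, the file contents, and the variable names demo_dir and listing are made up for the example.

```shell
# Hypothetical sandbox directory and files so the sketch is self-contained.
demo_dir=$(mktemp -d)
printf '%016d' 0 > "$demo_dir/01_entries.dat"   # 16 bytes of zeros
printf '%032d' 0 > "$demo_dir/02_entries.dat"   # 32 bytes of zeros
touch -d '2014-04-08 12:15' "$demo_dir/01_entries.dat"
touch -d '2014-04-23 15:21' "$demo_dir/02_entries.dat"

# %TY is the four-digit year of the modification time.
# %f is the file name with leading directories removed.
# find does not guarantee any order, so sort on the filename field (field 8).
listing=$(find "$demo_dir" -maxdepth 1 -type f -name '*.dat' \
    -printf '%M,%n,%u,%g,%s,%TY-%Tm-%Td,%TH:%TM,%f\n' | sort -t, -k8)
printf '%s\n' "$listing"

rm -rf "$demo_dir"
```

The permission, user, and group fields depend on the umask and on whoever runs the sketch, so they will differ from the listing in the question.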

Gratuitous time data quiz question: What is the range of values for %TS (modification time, seconds)?

00..59? Wrong.
00..60? Wrong.
00..61? Correct.

If you don't know why, better read up on "leap seconds" if you want to write stable time-stamp parsing code. ;-)

declension · Accepted Answer · 2014-05-26 11:03:20Z

Sounds like you're trying to produce a CSV file of some sort.

Probably better to do that the proper way (e.g. Python has a great library for this) for a variety of reasons: quoting, commas in filenames, spaces in filenames, important whitespace etc.

However a quick-and-dirty solution in bash might be:

ls -l *.dat | awk '{print $1", "$2", "$3", "$4", "$5", "$6", "$7", "$8", "$9}'

You could of course iterate over the parameters in awk properly, but then, err, it's not as quick and dirty...
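
One possible shape for that iteration, as a rough sketch rather than a robust CSV writer: loop over the first eight fields, then rejoin everything from field 9 onward as the filename. The function name to_csv is an assumption for the example, and runs of spaces inside a filename collapse to a single space because awk has already split on them.

```shell
# to_csv is a hypothetical helper name; it reads `ls -l` style lines on stdin.
to_csv() {
    awk '{
        # Join the eight metadata fields with ", ".
        line = $1
        for (i = 2; i <= 8; i++) line = line ", " $i
        # Everything from field 9 onward belongs to the filename.
        name = $9
        for (i = 10; i <= NF; i++) name = name " " $i
        print line ", " name
    }'
}

# Sample line from the question, with a space added to the filename.
printf '%s\n' '-rw-rw-rw-    1 root     root           16 Apr  8 12:15 01 entries.dat' | to_csv
```

With real files this would be called as ls -l *.dat | to_csv. It still does no quoting, so a comma in a filename would break the output.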

Note also that this doesn't use the alias ll from your script: it can mean different things to different users. Better to call the command directly (ls -l).
