Command output with empty values to csv

Question

> lsblk -o NAME,LABEL,FSTYPE,MOUNTPOINT,SIZE,TYPE -x NAME

NAME      LABEL FSTYPE MOUNTPOINT      SIZE TYPE
nvme0n1                              894.3G disk
nvme0n1p1              [SWAP]            4G part
nvme0n1p2                                1G part
nvme0n1p3 root         /home/cg/root 889.3G part

I need the output of this command in CSV format, but every method I've tried so far mishandles the empty values and produces bad rows like these, which I got with sed:

> lsblk -o NAME,LABEL,FSTYPE,MOUNTPOINT,SIZE,TYPE -x NAME | sed -E 's/ +/,/g'

NAME,LABEL,FSTYPE,MOUNTPOINT,SIZE,TYPE
nvme0n1,894.3G,disk
nvme0n1p1,[SWAP],4G,part
nvme0n1p2,1G,part
nvme0n1p3,root,/home/cg/root,889.3G,part

Any idea how to add the extra commas for the empty fields?

NAME,LABEL,FSTYPE,MOUNTPOINT,SIZE,TYPE
nvme0n1,,,,894.3G,disk

For the sample output nvme0n1,,,,894.3G,disk, could you please explain why there are 4 commas? There are more spaces than that, so shouldn't there be more than 4 commas?
– RavinderSingh13, Commented Oct 23, 2022 at 15:54
@RavinderSingh13 there are 4 commas to leave 3 empty places, for LABEL, FSTYPE and MOUNTPOINT.
– Luca Clavarino, Commented Oct 23, 2022 at 16:02
@AndreWildberg yes, I've seen that option and it could actually solve my issue (JSON is an acceptable format as well), but I need to apply the same logic to other commands that don't offer a JSON export option. lsblk just seemed like an easy example to present the issue.
– Luca Clavarino, Commented Oct 23, 2022 at 16:06
My feeling is that if lsblk has no option to output tabs as the separator in this output format, it's going to be very tricky to reformat reliably.
– Andre Wildberg, Commented Oct 23, 2022 at 16:16

Ljm Dullaart · Accepted Answer · 2022-10-23 16:26:31Z

1

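Make sure that the fields that may be empty are at the end of the line, then rearrange them into the required order. Note that this only helps when the empty fields are trailing in every row; a row with an empty FSTYPE but a non-empty MOUNTPOINT would still shift.

lsblk -o NAME,SIZE,TYPE,FSTYPE,MOUNTPOINT,LABEL -x NAME | awk -v OFS=',' '{ print $1,$6,$4,$5,$2,$3 }'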

1 Comment

Luca Clavarino Over a year ago

This doesn't really solve all my problems, since I have different commands to which I need to apply the same logic, and not all of them support column sorting, but it's a very nice solution for my specific example. I'm going to keep the question open for a while, just to see if some broader solution comes up; otherwise I'm going to accept this one.

KamilCuk · Accepted Answer · 2022-10-23 22:17:56Z

1

Just:

lsblk -o NAME,LABEL,FSTYPE,MOUNTPOINT,SIZE,TYPE -x NAME -r | tr ' ' ','
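
Why this can work, as a minimal sketch: it assumes that raw mode puts exactly one space between columns and prints nothing for an empty value, so an empty field shows up as two adjacent spaces. The raw_line helper below is hypothetical and only imitates such a row; it is not captured lsblk output.

```shell
#!/bin/sh
# Hypothetical stand-in for one raw-mode row: six columns joined by single
# spaces, with LABEL, FSTYPE and MOUNTPOINT empty (assumed raw layout).
raw_line() {
    printf '%s %s %s %s %s %s\n' nvme0n1 '' '' '' 894.3G disk
}

# tr maps every single space to a comma, so the empty fields survive.
raw_line | tr ' ' ','

# sed with ' +' collapses each run of spaces, so the empty fields are lost.
raw_line | sed -E 's/ +/,/g'
```

The first pipeline prints nvme0n1,,,,894.3G,disk and the second prints nvme0n1,894.3G,disk, which is the bad row from the question. A value that itself contains a comma would still need CSV quoting.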

4 Comments

Luca Clavarino Over a year ago

Another clean solution, even if this too needs the exact column sorting. This is not a problem with this particular example, but makes it not very "portable" to other commands.

KamilCuk Over a year ago

Uhm, thanks. I do not understand. What is "exact column sorting"? What is not portable?

Luca Clavarino Over a year ago

I mean, this works if the columns are in that exact order (I should've said "ordering", sorry) but not if you change it. Since I would apply the same logic to different commands, this works only with a command that lets you order the columns of its output.

KamilCuk Over a year ago

No, this works only with lsblk specifically. There is no ultimate golden solution for any command. Your question also does not ask about any Unix command, only about lsblk specifically (as I understood it). There is a (very big) project, github.com/kellyjonbrazil/jc, that tries to parse any Linux command. Every single command is different; there is no Swiss Army knife.

Ljm Dullaart · Accepted Answer · 2022-10-23 17:49:54Z

Not really bash, but a quick and dirty Perl script would be something like:

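use strict;
use warnings;

# Read all input and find the longest line (length includes the newline).
my @input=<>;
my $maxlength=0;
for my $line ( 0 .. $#input){
        my $curlength= length($input[$line]);
        if ($curlength>$maxlength){$maxlength=$curlength;}
}

# Pad every line with spaces so that all lines cover every position.
my $fill=' ' x $maxlength;
for my $line ( 0 .. $#input){
        chomp $input[$line];
        $input[$line]="$input[$line] $fill";
}

# A position that holds a space in every line is a gap between columns:
# overwrite it with ';' in every line. Fields keep their space padding.
for (my $pos=0; $pos<$maxlength; $pos++){
        my $spacecol=1;
        for my $line ( 0 .. $#input){
                if (substr($input[$line],$pos,1) ne ' '){
                        $spacecol=0;
                }
        }
        if ($spacecol==1){
                for my $line ( 0 .. $#input){
                        substr($input[$line],$pos,1)=';';
                }
        }
}

for my $line ( 0 .. $#input){
        print "$input[$line]\n";
}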
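
The same blank-column idea can also be sketched in awk, with two differences: a run of blank positions counts as one separator, and the padding is trimmed from each field. This is only an illustration under the assumption that every pair of adjacent columns is separated by at least one position that is blank in all rows. The names sample_table and fixed_to_csv are hypothetical; sample_table just rebuilds the table from the question.

```shell
#!/bin/sh
# Hypothetical sample: rebuilds the question's table with printf so the
# column widths are explicit (SIZE is right-justified, as in the question).
sample_table() {
    fmt='%-9s %-5s %-6s %-13s %6s %s\n'
    printf "$fmt" NAME LABEL FSTYPE MOUNTPOINT SIZE TYPE
    printf "$fmt" nvme0n1 '' '' '' 894.3G disk
    printf "$fmt" nvme0n1p1 '' '' '[SWAP]' 4G part
    printf "$fmt" nvme0n1p2 '' '' '' 1G part
    printf "$fmt" nvme0n1p3 root '' /home/cg/root 889.3G part
}

# Hypothetical helper: fixed-width text on stdin, CSV on stdout.
fixed_to_csv() {
    awk '
    function trim(s) { gsub(/^ +| +$/, "", s); return s }

    # Keep every line and remember the longest length.
    { line[NR] = $0; if (length($0) > max) max = length($0) }

    END {
        # A position is blank if no line has a non-space character there.
        # Each run of non-blank positions becomes one field [start, stop].
        for (p = 1; p <= max; p++) {
            blank = 1
            for (n = 1; n <= NR; n++)
                if (substr(line[n], p, 1) ~ /[^ ]/) { blank = 0; break }
            if (!blank) {
                if (!infield) { nf++; start[nf] = p }
                stop[nf] = p
            }
            infield = !blank
        }
        # Cut every line at the same field boundaries and trim the padding.
        for (n = 1; n <= NR; n++) {
            out = ""
            for (f = 1; f <= nf; f++)
                out = out (f > 1 ? "," : "") trim(substr(line[n], start[f], stop[f] - start[f] + 1))
            print out
        }
    }'
}

sample_table | fixed_to_csv
```

Because the boundaries come from the data rather than from the header alone, this does not depend on the column order. It does hold the whole input in memory, and it would split a column whose values contain spaces that happen to line up in every row.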

markp-fuso · Accepted Answer · 2022-10-23 23:20:45Z

Assumptions:

output format is fixed-width
header record does not contain any blank fields
no fields contain white space (ie, only white space occurs between fields)

Design overview:

parse the header to get the initial index of each field; if all columns were left-justified this would be all we need to do, but with right-justified columns (eg, SIZE) we also need to look for right-justified values that are longer than the associated header field (ie, the value starts at a lower index than the associated header)
for non-header rows, loop through the set of potential fields, using substr()/match() to find the next non-space string in the line, and ...
  if said string starts and ends before the next field's index, add it to the output variable as the current field's value, but ...
  if said string starts before the next field's index but ends after it, we're looking at a right-justified value of the next field that starts earlier than the associated header's index; in this case update the index of the next field and add a blank value (for the current field) to the output variable
  if said string starts at or after the index of the next field, the current field is empty; again, add a blank value to the output variable
once a line of input has been processed, print the output to stdout

One awk idea:

awk '
BEGIN   { OFS="," }

# use header record to determine initial set of indexes

FNR==1  { maxNF=NF   
          header=$0
          out=sep=""
          for (i=1;i<=maxNF;i++) {
              match(header,/[^[:space:]]+/)                             # find first non-space string
              ndx[i]=ndx[i-1] + prevlen + RSTART - (i==1 ? 0 : 1)       # make note of index
              out=out sep substr(header,RSTART,RLENGTH)                 # add value to our output variable
              sep=OFS
              prevlen=RLENGTH                                           # need for next pass through loop
              header=substr(header,RSTART+RLENGTH)                      # strip off matched string and repeat loop
          }
          print out                                                     # print header to stdout
          ndx[1]=1                                                      # in case 1st field is right-justified, override index and set to 1
          next
        }

# for rest of records need to determine which fields are empty and/or which fields need the associated index updated

        { out=sep=""
          for (i=1;i<maxNF;i++) {                                       # loop through all but last field
              restofline=substr($0,ndx[i])                              # work with current field thru to end of line
              if ( match(restofline,/[^[:space:]]+/) && ndx[i]-1+RSTART < ndx[i+1] ) {   # if a non-space match starts before index of next field and ...
                 if ( ndx[i]-1+RSTART+RLENGTH < ndx[i+1] )              # ends before index of next field then ...
                    out=out sep substr(restofline,RSTART,RLENGTH)       # append value to our output variable
                 else {                                                 # else if match finished beyond index of next field then ...
                    out=out sep ""                                      # this field is empty and ...
                    diff=ndx[i+1]-(ndx[i]+RSTART-1)                     # figure the difference and ...
                    ndx[i+1]-=diff                                      # update the index for the next field
                 }
              }
              else                                                      # no match, or match starts at/after next field's index: current field is empty
                 out=out sep ""
              sep=OFS
          }
          field=substr($0,ndx[maxNF])                                   # process last field
          gsub(/[[:space:]]/,"",field)                                  # remove all remaining spaces
          print out, field                                              # print new line to stdout
        }
' lsblk.out

This generates:

NAME,LABEL,FSTYPE,MOUNTPOINT,SIZE,TYPE
nvme0n1,,,,894.3G,disk
nvme0n1p1,,,[SWAP],4G,part
nvme0n1p2,,,,1G,part
nvme0n1p3,root,,/home/cg/root,889.3G,part
