
In a shell script, I'm trying to take the output of a command and turn it into a JSON object. Thankfully the output is already laid out in a format that should make this easy, in that each line is effectively "key":"value". That should be easy, right? But I've been fiddling and reading through so many threads that my head is spinning, and I'm not sure which route to take.

I can't figure out how to do this in a bash script (it needs to be a shell script) that will run on a vanilla macOS host (no extra commands installed with brew or compiled from source).

I'm trying to take a command like 'diskutil info /dev/disk2', that outputs data like this:

   Device Identifier:         disk2
   Device Node:               /dev/disk2
   Whole:                     Yes
   Part of Whole:             disk2
   Device / Media Name:       DataTraveler G3

   Volume Name:               Not applicable (no file system)
   Mounted:                   Not applicable (no file system)
   File System:               None

   Content (IOContent):       GUID_partition_scheme
   OS Can Be Installed:       No
   Media Type:                Generic
   Protocol:                  USB
   SMART Status:              Not Supported

   Disk Size:                 31.0 GB (30967529472 Bytes) (exactly 60483456 512-Byte-Units)
   Device Block Size:         512 Bytes

   Read-Only Media:           No
   Read-Only Volume:          Not applicable (no file system)

   Device Location:           External
   Removable Media:           Removable
   Media Removal:             Software-Activated

   Virtual:                   No

And get a (valid) JSON object, that looks like this:

{   "Device Identifier" : "disk2",
    "Device Node" : "/dev/disk2",
    "Whole" : "Yes",
    "Part of Whole" : "disk2",
    "Device / Media Name" : "DataTraveler G3",
    "Volume Name" : "Not applicable (no file system)",
    "Mounted" : "Not applicable (no file system)",
    "File System" : "None",
    "Content (IOContent)" : "GUID_partition_scheme",
    "OS Can Be Installed" : "No",
    "Media Type" : "Generic",
    "Protocol" : "USB",
    "SMART Status" : "Not Supported",
    "Disk Size" : "31.0 GB (30967529472 Bytes) (exactly 60483456 512-Byte-Units)",
    "Device Block Size" : "512 Bytes",
    "Read-Only Media" : "No",
    "Read-Only Volume" : "Not applicable (no file system)",
    "Device Location" : "External",
    "Removable Media" : "Removable",
    "Media Removal" : "Software-Activated",
    "Virtual" : "No"
}

Now, yes, I could build this by hand, using awk or something to look for "Device Identifier:" and take the rest of the line as the value. But the fields in this diskutil output are dynamic and change depending on what type of storage is at the device file (/dev/disk2 vs /dev/disk3 vs /dev/disk4, etc.), so hard-coding the keys is out.

diskutil has a -plist option, and macOS has a utility called plutil that can convert a plist to JSON... yay, right? Except 1) the plist has to be written to a file, because plutil only reads from a file, and 2) the resulting JSON isn't valid. Thanks for nothing. So that's out.

I can use sed to grab the strings before the colon and put them in a variable for keys. I can use sed to grab the strings after the colon and put them in a variable for values. But then I can't fathom how to zip them together into pairs. I wondered if I could create an array and create the key/value pairs on the fly, one at a time, but I can't seem to figure that out either.

Maybe I'm tired, or maybe this is impossible (or incredibly hard using vanilla commands), but I'd appreciate any guidance that might help me move the rock forward a bit. Thanks!

(SIDE NOTE: It just dawned on me to look at 'apropos json' on this macOS host, so delving into that as an option... bah, dead end.)

  • Why does it "need to be a bash script"? A vanilla MacOS host has Python out-of-the-box. Indeed, even if you were to double down on the needs-to-be-a-bash-script, the best answers will be a bash script that starts Python, because Python has JSON libraries and bash doesn't. Commented May 7, 2021 at 23:27

3 Answers

plutil can read from stdin and write to stdout if a hyphen is given in place of each file name. The JSON produced by the following command can be consumed by jq.

diskutil info -plist /dev/disk1 | plutil -convert json -r -o - -
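
If the result needs further processing in the same script, the plist can also be parsed directly instead of going through plutil. This is only a minimal sketch, assuming a Python 3 interpreter is available on the host (which is not a given on every macOS release). The helper names plist_to_json and disk_info_json are made up for illustration, and default=str is just one way of coping with plist dates and raw data, which JSON has no native type for.

```python
import json
import plistlib
import subprocess


def plist_to_json(plist_bytes):
    """Convert XML or binary plist bytes into an indented JSON string."""
    data = plistlib.loads(plist_bytes)
    # default=str stringifies values JSON cannot represent (dates, raw bytes).
    return json.dumps(data, indent=4, default=str)


def disk_info_json(device):
    """Run `diskutil info -plist <device>` and return the result as JSON."""
    plist_bytes = subprocess.check_output(["diskutil", "info", "-plist", device])
    return plist_to_json(plist_bytes)


# Example call on a Mac (the device path is only an illustration):
# print(disk_info_json("/dev/disk2"))
```

Splitting the conversion from the diskutil call keeps the parsing step usable on plist data from any source.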

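#!/usr/bin/perl
# Usage: diskutil info /dev/disk2 | ./r.pl
use strict;
use warnings;

my @pairs;
while (my $line = <>) {
    chomp $line;
    # Split on the first colon only, so a value that contains colons stays whole.
    my ($key, $value) = split /:/, $line, 2;
    next unless defined $value;    # skip blank lines and lines without a colon
    s/^\s+|\s+$//g for $key, $value;    # trim surrounding whitespace
    # Caveat: keys and values are inserted verbatim, with no JSON escaping.
    push @pairs, qq("$key":"$value");
}
# Join at the end so the last pair never gets a trailing comma.
print "{\n", join(",\n", @pairs), "\n}\n";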

2 Comments

What happens when your volume name has quotes in it and needs to be changed from including " to \"? Or a backslash that needs to be doubled to be valid JSON, or a tab that needs to be changed to \t, or a nonprintable character that needs to be changed to a \u#### escape?
(Also, ./r.pl < diskutil info /dev/disk2 is the same as ./r.pl info /dev/disk2 <diskutil; it's not the same as diskutil info /dev/disk2 | ./r.pl)
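
To make the escaping concern concrete, here is a small sketch comparing naive string concatenation with a JSON library. The volume name is invented for the example; it contains a quote, a backslash, a tab and a control character.

```python
import json

# Hypothetical volume name with characters that JSON requires to be escaped.
label = 'My "USB" \\ disk\t\x01'

# Naive concatenation: the embedded quote ends the string early, so the
# result is not valid JSON.
naive = '{"Volume Name":"' + label + '"}'

# json.dumps applies the escaping rules and always produces valid JSON.
safe = json.dumps({"Volume Name": label})

try:
    json.loads(naive)
    print("naive output parsed")
except ValueError:
    print("naive output is not valid JSON")

# Round-trips back to the original string.
print(json.loads(safe)["Volume Name"] == label)
```

The same reasoning applies to any hand-rolled approach in sed, awk or Perl: it works until a value contains one of these characters.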

Use a language (such as Python, included in stock macOS) that provides JSON libraries. Your sample data here isn't doing anything that simple string manipulation techniques can't handle, but as soon as you have a volume label that contains quotes or backslashes, the easy answers break badly; by contrast, the script below will always emit valid JSON as its output.

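#!/bin/sh
diskutil info /dev/disk2 | python -c '
import collections, json, re, sys

# Key: everything before the first colon. Value: the rest, trimmed of surrounding whitespace.
content_re = re.compile(r"^\s*(\S[^:]+):\s*(\S(?:.*\S)?)\s*$")
# OrderedDict keeps the keys in input order even on interpreters whose dict does not.
content = collections.OrderedDict()

for line in sys.stdin:
    match = content_re.match(line)
    if match is None:  # blank line, or no "key: value" pair
        continue
    content[match.group(1)] = match.group(2)
json.dump(content, sys.stdout, indent=4)
'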

...properly emits as output (given a diskutil emitting your stated input):

{
    "Device Identifier": "disk2",
    "Device Node": "/dev/disk2",
    "Whole": "Yes",
    "Part of Whole": "disk2",
    "Device / Media Name": "DataTraveler G3",
    "Volume Name": "Not applicable (no file system)",
    "Mounted": "Not applicable (no file system)",
    "File System": "None",
    "Content (IOContent)": "GUID_partition_scheme",
    "OS Can Be Installed": "No",
    "Media Type": "Generic",
    "Protocol": "USB",
    "SMART Status": "Not Supported",
    "Disk Size": "31.0 GB (30967529472 Bytes) (exactly 60483456 512-Byte-Units)",
    "Device Block Size": "512 Bytes",
    "Read-Only Media": "No",
    "Read-Only Volume": "Not applicable (no file system)",
    "Device Location": "External",
    "Removable Media": "Removable",
    "Media Removal": "Software-Activated",
    "Virtual": "No"
}

4 Comments

Regex will not match when value is a single character.
Obviously, yes. When writing it, I considered that better than potentially capturing blank spaces at the ends of lines as part of the data. That said, it's been amended to address the case.
Is it simpler to do ...:\s*(.*?)\s*$ ?
That's certainly also a reasonable approach.
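
For anyone weighing the two patterns, here is a quick sketch that runs both against a few lines. The two_char_re pattern is only a guess at what a version requiring at least two value characters might look like; the sample lines are invented.

```python
import re

# Pattern from the answer: the value is one non-space character, optionally
# followed by anything that ends in a non-space character.
answer_re = re.compile(r"^\s*(\S[^:]+):\s*(\S(?:.*\S)?)\s*$")
# Simpler pattern from the comments: lazy match, trimmed by the trailing \s*.
lazy_re = re.compile(r"^\s*(\S[^:]+):\s*(.*?)\s*$")
# Assumption: a stricter form whose value needs at least two characters.
two_char_re = re.compile(r"^\s*(\S[^:]+):\s*(\S.*\S)\s*$")


def parse(pattern, line):
    """Return (key, value) if the line matches, otherwise None."""
    m = pattern.match(line)
    return (m.group(1), m.group(2)) if m else None


samples = [
    "   Whole:                     Y  \n",   # single-character value
    "   Disk Size:                 a:b\n",   # value containing a colon
    "   Mounted:\n",                         # empty value
]

for line in samples:
    print(repr(line))
    print("  answer  :", parse(answer_re, line))
    print("  lazy    :", parse(lazy_re, line))
    print("  two-char:", parse(two_char_re, line))
```

One difference worth knowing about: the lazy pattern also accepts a line whose value is empty and yields an empty string, whereas the pattern in the answer skips such lines.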
