I've read about and tried extracting a substring from a given string with awk, sed, or grep, but I can't get it working or figure out how to accomplish this.

I have the string below which describes configurations of my VMs:

config: diskSizeGb: 100 diskType: pd-standard imageType: COS_CONTAINERD machineType: e2-micro metadata: disable-legacy-endpoints: 'true' preemptible: true status: RUNNING version: 1.19.9

How can I extract a substring, for example "preemptible: true" or "status: RUNNING", knowing that the values can be different for each VM?

Thank you!

2 Answers

Assumptions:

  • the VM config name/value pairs may not be in the same order
  • config names and values are single strings with no embedded white space
  • each config name is preceded by (at least) one space, and followed immediately by a colon (:)
  • there may be multiple spaces between the colon (:) and the config value; we want to maintain these spaces in the output

One idea using sed and a capture group:

# note: extra spaces placed between 'version:' and '1.19.9'

cfg_string="config: diskSizeGb: 100 diskType: pd-standard imageType: COS_CONTAINERD machineType: e2-micro metadata: disable-legacy-endpoints: 'true' preemptible: true status: RUNNING version:   1.19.9"

for config in preemptible status version
do
        echo "++++++++++++++ ${config}"
        sed -nE "s/.* (${config}:[ ]*[^ ]*).*/\1/p" <<< "${cfg_string}"
done

sed details:

  • -nE - disable default printing of the input (we'll use /p to explicitly print our capture group); enable Extended regex support
  • .* (${config}:[ ]*[^ ]*).* - match variable number of characters (.*) + a space ( ) + ${config} + a colon (:) + zero or more spaces ([ ]*) + everything that follows that is not a space ([^ ]*) + the rest of the input (.*); the parens mark the start/end of the capture group (only one capture group in this case)
  • \1 - reference capture group #1 (ie, everything inside of the parens)
  • /p - print (the capture group)
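
To see how the pieces of the pattern behave, here is a small sketch that runs the same sed expression against a few made-up strings. The extract_status function name and the sample strings are only for illustration; printf is used to feed the string to sed so the sketch does not depend on here-strings.

```shell
#!/bin/bash

# extract_status is a hypothetical helper wrapping the sed command from above,
# with the config name hard-coded to "status"
extract_status() {
    printf '%s\n' "$1" | sed -nE "s/.* (status:[ ]*[^ ]*).*/\1/p"
}

extract_status "a: 1 status:RUNNING b: 2"      # [ ]* matches zero spaces  -> status:RUNNING
extract_status "a: 1 status:   RUNNING b: 2"   # [ ]* matches three spaces -> status:   RUNNING
extract_status "status: RUNNING b: 2"          # no space before the name  -> prints nothing
```

The last call prints nothing because the pattern requires a space in front of the config name, which is why the assumption about a leading space matters.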

This generates:

++++++++++++++ preemptible
preemptible: true
++++++++++++++ status
status: RUNNING
++++++++++++++ version
version:   1.19.9                # extra spaces maintained

NOTES:

  • obviously an invalid config name (eg, stat, versions) is going to produce no output
  • the sed results could be captured in a variable for further testing/processing (would address issue of an invalid config name)
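
As a rough sketch of that last note, the sed output could be captured in a variable and tested for being empty. The get_cfg function name is made up for this example, and printf is used instead of a here-string only to keep the sketch portable.

```shell
#!/bin/bash

cfg_string="config: diskSizeGb: 100 diskType: pd-standard imageType: COS_CONTAINERD machineType: e2-micro metadata: disable-legacy-endpoints: 'true' preemptible: true status: RUNNING version:   1.19.9"

# get_cfg (hypothetical helper): print "name: value" for config name $1
# found in string $2, or print nothing if the name is not present
get_cfg() {
    printf '%s\n' "$2" | sed -nE "s/.* ($1:[ ]*[^ ]*).*/\1/p"
}

for name in status stat
do
        result=$(get_cfg "${name}" "${cfg_string}")
        if [ -n "${result}" ]
        then
                echo "found: ${result}"
        else
                echo "no match for config name '${name}'"
        fi
done
```

With the sample string this should print "found: status: RUNNING" followed by "no match for config name 'stat'".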

3 Comments

wow @markp-fuso, that was exactly what I was looking for and you are right with your assumptions. It can change orders based on each VM. Is there any chance you can break down the sed command so I can understand what each piece does?
@CaioT added some notes about sed
Much appreciated. Thank you so much!

Here is a possible solution:

#!/bin/bash

data="config: diskSizeGb: 100 diskType: pd-standard imageType: COS_CONTAINERD machineType: e2-micro metadata: disable-legacy-endpoints: 'true' preemptible: true status: RUNNING version: 1.19.9"

preemptible=$(echo ${data} | cut -d ' ' -f 14)
echo "preemptible = ${preemptible}"

status=$(echo ${data} | cut -d ' ' -f 16)
echo "status = ${status}"
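
The field numbers 14 and 16 only hold while the name/value pairs keep this exact order and spacing. If that cannot be guaranteed, one alternative (not what the cut commands above do) is to look a value up by its name. A minimal sketch using awk, where get_value is a made-up helper name and it is assumed that names and values contain no embedded spaces:

```shell
#!/bin/bash

data="config: diskSizeGb: 100 diskType: pd-standard imageType: COS_CONTAINERD machineType: e2-micro metadata: disable-legacy-endpoints: 'true' preemptible: true status: RUNNING version: 1.19.9"

# get_value (hypothetical helper): print the field that follows "<name>:"
# awk's default field splitting treats any run of spaces as one separator
get_value() {
    printf '%s\n' "$2" |
        awk -v name="$1:" '{ for (i = 1; i < NF; i++) if ($i == name) { print $(i + 1); exit } }'
}

preemptible=$(get_value preemptible "${data}")
echo "preemptible = ${preemptible}"

status=$(get_value status "${data}")
echo "status = ${status}"
```

This should print "preemptible = true" and "status = RUNNING" regardless of where the two pairs sit in the string.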

3 Comments

Hi Antonio, thank you for the input. The 14 and 16 values are the white spaces count right? Is there any more reliable way to extract these strings? The reason I ask is because the config of VMs can vary and one can have more spaces than other configs within the data variable.
The cut command splits the string by spaces, so the value 14 represents the 14th field, which is the value of preemptible.
Note that echo ${data} is itself buggy. If you have value: *, you're going to get that * replaced with a list of filenames in your current directory. Always quote your expansions: echo "$data" -- curly braces are entirely optional, quotes are not.
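
To illustrate the quoting point from the last comment, here is a small sketch. The scratch directory, file names, and sample string are invented for the demo.

```shell
#!/bin/bash

# scratch directory with two files, so that * has something to expand to
tmpdir=$(mktemp -d)
touch "${tmpdir}/a.txt" "${tmpdir}/b.txt"

data="value: *   extra:   spaces"

# unquoted: the shell word-splits the string and expands * before echo sees it
unquoted=$(cd "${tmpdir}" && echo ${data})

# quoted: echo receives the string exactly as stored
quoted=$(cd "${tmpdir}" && echo "${data}")

echo "unquoted: ${unquoted}"    # unquoted: value: a.txt b.txt extra: spaces
echo "quoted:   ${quoted}"      # quoted:   value: *   extra:   spaces

rm -rf "${tmpdir}"
```

Besides the filename expansion, the unquoted form also collapses runs of spaces, which silently changes the field positions that cut sees.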
