I have a shell script that reads data from a YAML file and then does some processing. The YAML file looks like this -

view:
    schema1.view1:/some-path/view1.sql
    schema2.view2:/some-path/view2.sql
tables:
    schema1.table1:/some-path/table1.sql
    schema2.table2:/some-path/table2.sql
end

I want the output as -

schema:schema1
object:view1
fileloc:/some-path/view1.sql

schema:schema2
object:view2
fileloc:/some-path/view2.sql

schema:schema1
object:table1
fileloc:/some-path/table1.sql

schema:schema2
object:table2
fileloc:/some-path/table2.sql

This is how I'm reading the YAML file using the shell script -

#!/bin/bash

file=./file.yaml

viewData=$(sed '/view/,/tables/!d;/tables/q' "$file" | sed '1d;$d')
tableData=$(sed '/tables/,/end/!d;/end/q' "$file" | sed '1d;$d')

so viewData will have this data -

schema1.view1:/some-path/view1.sql
schema2.view2:/some-path/view2.sql

and tableData will have this data -

schema1.table1:/some-path/table1.sql
schema2.table2:/some-path/table2.sql

And then I'm using a for loop to separate the schema, object and SQL file -

for line in $tableData; do
        field=`echo $line | cut -d: -f1`
        schema=`echo $field | cut -d. -f1`
        object=`echo $field | cut -d. -f2`
        fileLoc=`echo $line | cut -d: -f2`

        echo "schema=$schema"
        echo "object=$object"
        echo "fileloc=$fileLoc"
done

But I'll have to do the same thing again for the views. Is there a way in a shell script, such as an array or something else, to use the same loop to get the data for both views and tables?

Any help would be appreciated. Thanks!

1 Answer

Using (g)awk:

awk -F "[:.]" '/:$/{ s=$1 }{ gsub(" ",""); if($3!=""){ print "schema="$1; print "object="$2; print "fileloc="$3 }}' yaml
  • -F "[:.]" sets the field separator to the regular expression [:.], so each input line is split on : or .
  • /:$/{ s=$1 } stores the group (view or tables) you are currently reading. It is not used in this command, so it can be ignored.
  • gsub(" ",""); deletes all spaces in the input line.
  • if... When the line has at least three fields (checked by a non-empty third field), print the info.
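To see what the separator does to a single data line, here is a small sketch that prints each field with its index. The sample line is copied from the question; the variable names line and fields are only for illustration.

```shell
#!/bin/bash
# One data line from the question's file, including its indentation.
line='    schema1.view1:/some-path/view1.sql'

# Split on ":" or ".", strip the spaces, then list every field by number.
fields=$(printf '%s\n' "$line" |
    awk -F '[:.]' '{ gsub(" ", ""); for (i = 1; i <= NF; i++) print i ": " $i }')

printf '%s\n' "$fields"
```

Because . is a separator too, the file extension ends up in a field of its own ($4) instead of staying attached to the path in $3.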

output:

schema=schema1
object=view1
fileloc=/some-path/view1
schema=schema2
object=view2
fileloc=/some-path/view2
schema=schema1
object=table1
fileloc=/some-path/table1
schema=schema2
object=table2
fileloc=/some-path/table2
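
Note that the fileloc values above have lost their .sql extension, because . is one of the field separators. If the full path is needed, one possible variation (a sketch, not the command from this answer) is to split on : only and separate schema and object with split(). It assumes that neither the names nor the paths contain a : and writes the question's sample data to a temporary file so it can run on its own.

```shell
#!/bin/bash
# Sample input copied from the question, written to a temporary file.
yaml=$(mktemp)
cat > "$yaml" <<'EOF'
view:
    schema1.view1:/some-path/view1.sql
    schema2.view2:/some-path/view2.sql
tables:
    schema1.table1:/some-path/table1.sql
    schema2.table2:/some-path/table2.sql
end
EOF

# Split each line on ":" only, then split the name part on "." by hand,
# so the dot in ".sql" is never treated as a separator.
result=$(awk -F: '
    NF == 2 && $2 != "" {
        gsub(/ /, "", $1)        # drop the indentation
        split($1, name, ".")     # name[1] = schema, name[2] = object
        print "schema=" name[1]
        print "object=" name[2]
        print "fileloc=" $2
    }' "$yaml")

printf '%s\n' "$result"
rm -f "$yaml"
```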

EDIT: Adding the objectType to the output:

awk -F "[:.]" '/:$/{ s=$1 }{ gsub(" ",""); if($3!=""){ print "objectType="$s; "schema="$1; print "object="$2; print "fileloc="$3 }}' yaml

But I do see that I made a mistake.... 🤔😉

I expected the regular expression /:$/ to match a line that ends with a :, but for some reason it did not seem to. (I will have to do some more research into that.) Note also that $s in the command above is a field reference rather than the variable s, so it should be written as s.

A working workaround is:

awk -F "[:.]" 'NF==2{ s=$1 }NF>2{ gsub(" ",""); if($3!=""){ print "objectType="s; print "schema="$1; print "object="$2; print "fileloc="$3 }}' yaml
  • The line with view: has two fields, so NF is 2, and view is stored in the variable s.
  • When a line has more than two fields, the contents of the variable and the fields are printed.
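
To process views and tables in the same shell loop, one option is to let awk print one record per line and read it back with while read. The following is only a sketch under some assumptions: it uses bash (process substitution and arrays), it splits on : only so the .sql extension is kept, it assumes no name or path contains | or :, and the records array is a hypothetical stand-in for whatever processing is needed.

```shell
#!/bin/bash
# Sample input copied from the question, written to a temporary file.
yaml=$(mktemp)
cat > "$yaml" <<'EOF'
view:
    schema1.view1:/some-path/view1.sql
    schema2.view2:/some-path/view2.sql
tables:
    schema1.table1:/some-path/table1.sql
    schema2.table2:/some-path/table2.sql
end
EOF

records=()   # hypothetical: collects one summary string per object

# awk emits "type|schema|object|path"; the loop handles both sections.
while IFS='|' read -r objectType schema object fileLoc; do
    echo "objectType=$objectType"
    echo "schema=$schema"
    echo "object=$object"
    echo "fileloc=$fileLoc"
    records+=("$objectType $schema.$object $fileLoc")
done < <(awk -F: '
    NF == 2 && $2 == "" { type = $1; next }   # header line such as "view:"
    NF == 2 {
        gsub(/ /, "", $1)                     # drop the indentation
        split($1, name, ".")                  # schema and object
        print type "|" name[1] "|" name[2] "|" $2
    }' "$yaml")

rm -f "$yaml"
```

Reading from a process substitution instead of piping into while keeps the loop in the current shell, so variables set inside it (here records) are still available afterwards.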

5 Comments

Actually I need to loop the output values in another for loop.
From your question it is unclear what is stopping you from looping over the output of this.
Sorry, please bear with me... this is kind of new for me. Can you please tell me how to loop over the output values? The same loop should work for tables and views in separate iterations.
What if I want the object type in the output as well? For example: objectType=view schema=schema1 object=view1 fileloc=/some-path/view1.sql. Please help!
