How to replace string using regex in shell script

Question

I am working on a script that extracts data into a text file (fossa_results.txt) with a curl command. The extracted response looks like this:

 "license_count": 32,
    "dependency_count": 295,
    "todo_count": 9,
    "unresolved_issue_count": 6,
    "unresolved_licensing_issue_count": 2,
    "unresolved_security_issue_count": 4,
    "unresolved_quality_issue_count": 0,

The response above is written to a text file (fossa_results.txt). I am trying to run a string replacement on that file with sed and a regex pattern, and to write the result back to the same file (fossa_results.txt). The expected outcome is:

License Count= 32
Dependency Count= 295
Todo Count= 9
Unresolved Issue Count= 6
Unresolved Licensing Issue Count= 2
Unresolved Security Issue Count= 4
Unresolved Quality Issue Count= 0

Below is the script I have used for this.

sed -i 's/^[[:space:]]*//' fossa_results.txt   # trying to remove leading spaces
sed -i 's/[[:space:]]*$//' fossa_results.txt   # trying to remove trailing spaces
sed -i 's/"\""/""/g' fossa_results.txt   # trying to replace "
sed -i 's/"\"\\[.*?\\]: "/""/g' fossa_results.txt   # trying to remove any unwanted string that comes within [] like date
sed -i 's/"\"\\[.*?\\]"/""/g' "fossa_results.txt"
sed -i 's/"\"license_count:"/"License Count="/g' "fossa_results.txt"
sed -i 's/"\"todo_count:"/"Todo Count="/g' "fossa_results.txt"
sed -i 's/"\"  dependency_count:"/"Dependancy Count="/g' "fossa_results.txt"
sed -i 's/"\"  unresolved_issue_count:"/"Unresolved Issue Count="/g' "fossa_results.txt"
sed -i 's/"\"  unresolved_licensing_issue_count:"/"Unresolved Licensing Issue Count="/g' "fossa_results.txt"
sed -i 's/"\"  unresolved_security_issue_count:"/"Unresolved Security Issue Count="/g' "fossa_results.txt"
sed -i 's/"\"  unresolved_quality_issue_count:"/"Unresolved Quality Issue Count="/g' "fossa_results.txt"
fossaresults="$(cat fossa_results.txt)"

But when I print fossa_results.txt with cat, it still shows the original data, so the replacements do not seem to be working.

Assuming there are no CRs in your text file, something like: sed 's/^[[:blank:]]*//;s/[",]*//g;s/_/ /g;s/\w\+/\L\u&/g;s/:/=/' fossa_results.txt
– Jetchisel, Commented Dec 23, 2022 at 11:57
Running sed -i on the same file repeatedly is an antipattern. You want to replace curl >file; sed -i xxx file; sed -i yyy file with simply curl | sed -e xxx -e yyy >file
– tripleee, Commented Dec 23, 2022 at 11:58
If the input is actually JSON, use a JSON tool like jq to process it.
– tripleee, Commented Dec 23, 2022 at 11:59
Questions that ask "please help me" tend to be looking for highly localized guidance, or in some cases, ongoing or private assistance, which is not suited to our Q&A format. It is also rather vague, and is better replaced with a more specific question. Please read: Why is "Can someone help me?" not an actual question?
– halfer, Commented Jan 2, 2023 at 10:19
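
The single-pipeline approach suggested by tripleee could look roughly like the sketch below. It assumes GNU sed (for \w, \L and \u) and reuses the expressions from Jetchisel's comment. The function fetch_counts is a hypothetical stand-in for the real curl call, whose URL is not given in the question, and format_counts is just an illustrative name.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for the real `curl ...` call: prints a sample response.
fetch_counts() {
    printf '%s\n' \
        ' "license_count": 32,' \
        '    "dependency_count": 295,' \
        '    "todo_count": 9,'
}

# One sed process applies every substitution in order.
# Assumes GNU sed: \w, \L and \u are GNU extensions.
format_counts() {
    sed -e 's/^[[:blank:]]*//' \
        -e 's/[",]//g' \
        -e 's/_/ /g' \
        -e 's/\w\+/\L\u&/g' \
        -e 's/:/=/'
}

# The file is written exactly once, already formatted.
fetch_counts | format_counts > fossa_results.txt
cat fossa_results.txt
```

Because the response never lands on disk unformatted, there is nothing to edit in place afterwards.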

j_b · Accepted Answer · 2022-12-23 13:10:54Z

An awk alternative:

 awk '{ gsub(":","="); gsub(/^ *|\"|,/,""); gsub("_"," "); for (i=1; i<=NF; ++i) { $i=toupper(substr($i,1,1)) tolower(substr($i,2)); }}1' src.dat
License Count= 32
Dependency Count= 295
Todo Count= 9
Unresolved Issue Count= 6
Unresolved Licensing Issue Count= 2
Unresolved Security Issue Count= 4
Unresolved Quality Issue Count= 0

Replace all colons with an equal sign: gsub(":","=");

Remove leading spaces, double quotes, and commas: gsub(/^ *|\"|,/,"");

Replace each underscore with a single space: gsub("_"," ");

Capitalize the first letter of each field and lowercase the rest: for (i=1; i<=NF; ++i) { $i=toupper(substr($i,1,1)) tolower(substr($i,2)); }

The trailing 1 is an always-true pattern, so awk prints each modified record.
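
The command above prints to standard output, while the question asks for the result to be written back to the same file. A minimal sketch of one way to do that follows, using the same substitutions (with the leading-space removal split into its own sub call) and a temporary file that replaces the original only if awk succeeds. The scratch directory and sample lines are there only to keep the example self-contained.

```shell
#!/usr/bin/env bash
# Scratch copy of the sample input so the sketch does not touch real files.
workdir=$(mktemp -d)
file="$workdir/fossa_results.txt"
printf '%s\n' \
    ' "license_count": 32,' \
    '    "unresolved_security_issue_count": 4,' > "$file"

# Redirecting awk output straight onto its input would truncate the file
# before it is read, so write to a temporary file and rename it afterwards.
tmp=$(mktemp "$workdir/tmp.XXXXXX")
awk '{
    gsub(":", "=")               # colons become equal signs
    sub(/^ +/, "")               # drop leading spaces
    gsub(/[",]/, "")             # drop double quotes and commas
    gsub("_", " ")               # underscores become spaces
    for (i = 1; i <= NF; ++i)    # capitalize each field
        $i = toupper(substr($i, 1, 1)) tolower(substr($i, 2))
} 1' "$file" > "$tmp" && mv "$tmp" "$file"

cat "$file"
```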

Input file src.dat contents:

"license_count": 32,
    "dependency_count": 295,
    "todo_count": 9,
    "unresolved_issue_count": 6,
    "unresolved_licensing_issue_count": 2,
    "unresolved_security_issue_count": 4,
    "unresolved_quality_issue_count": 0,

Itération 122442 · Accepted Answer · 2022-12-23 12:47:25Z

This answer is off topic, because it proposes an awk solution instead of sed or bash as tagged (but it can still help).

You can use awk to format the content of the file correctly.

d.txt

 "license_count": 32,
    "dependency_count": 295,
    "todo_count": 9,
    "unresolved_issue_count": 6,
    "unresolved_licensing_issue_count": 2,
    "unresolved_security_issue_count": 4,
    "unresolved_quality_issue_count": 0,

a.awk

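# Split each line on the colon: $1 is the key, $2 is the value.
BEGIN {
    FS=":"
}
{
    gsub("\"","",$1)    # remove the double quotes from the key
    gsub(" ","",$1)     # remove the spaces from the key
    gsub(",","",$2)     # remove the trailing comma from the value
    print $1"="$2
}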

Usage

awk -f a.awk d.txt

Output

license_count= 32
dependency_count= 295
todo_count= 9
unresolved_issue_count= 6
unresolved_licensing_issue_count= 2
unresolved_security_issue_count= 4
unresolved_quality_issue_count= 0
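
The output above keeps the lowercase, underscore-separated keys, whereas the question asks for labels such as "License Count". One possible extension of the same field-splitting idea is sketched below; it is not part of the original answer, and the function name format_labels is only illustrative.

```shell
#!/usr/bin/env bash
# Same approach as a.awk (split on ":"), plus title-casing of the key.
format_labels() {
    awk -F: '
    NF >= 2 {
        gsub(/[" ]/, "", $1)          # strip quotes and spaces from the key
        gsub(",", "", $2)             # strip the trailing comma from the value
        n = split($1, words, "_")     # break the key into words at underscores
        label = ""
        for (i = 1; i <= n; i++) {
            w = toupper(substr(words[i], 1, 1)) substr(words[i], 2)
            label = (i == 1 ? w : label " " w)
        }
        print label "=" $2
    }'
}

printf '%s\n' \
    ' "license_count": 32,' \
    '    "unresolved_quality_issue_count": 0,' | format_labels
```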

sseLtaH · Accepted Answer · 2022-12-23 13:26:22Z

Using GNU sed

$ sed -Ei.bak ':a;s/ +?([^:]*)_/\1 /;ta;s/:/=/;s/[",]//g;s/[a-z]+/\u&/g' input_file
License Count= 32
Dependency Count= 295
Todo Count= 9
Unresolved Issue Count= 6
Unresolved Licensing Issue Count= 2
Unresolved Security Issue Count= 4
Unresolved Quality Issue Count= 0
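
To see what the -i.bak option does, the command can be tried on a scratch file first. The sketch below assumes GNU sed and splits the one-liner into separate -e expressions for readability. The optional run of leading spaces is written here as " *" rather than " +?", on the assumption that both match the same thing. The scratch directory and sample lines are illustrative.

```shell
#!/usr/bin/env bash
# Work on a scratch file so nothing real is modified.
workdir=$(mktemp -d)
input_file="$workdir/input_file"
printf '%s\n' ' "license_count": 32,' '    "todo_count": 9,' > "$input_file"

# GNU sed: -E enables extended regexes; -i.bak edits the file in place
# and keeps the untouched original as input_file.bak.
sed -Ei.bak \
    -e ':a' \
    -e 's/ *([^:]*)_/\1 /' \
    -e 'ta' \
    -e 's/:/=/' \
    -e 's/[",]//g' \
    -e 's/[a-z]+/\u&/g' \
    "$input_file"

cat "$input_file"        # reformatted content
cat "$input_file.bak"    # original content, kept as a backup
```

The :a / ta pair loops on the first substitution until no underscore is left before the colon; the remaining expressions then run once per line.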
