0

I am working on a script that uses curl to extract data into a text file (fossa_results.txt). The extracted response looks like this:

 "license_count": 32,
    "dependency_count": 295,
    "todo_count": 9,
    "unresolved_issue_count": 6,
    "unresolved_licensing_issue_count": 2,
    "unresolved_security_issue_count": 4,
    "unresolved_quality_issue_count": 0,

The response above is written to fossa_results.txt. I am trying to replace strings in that file using sed and regex patterns, writing the result back to the same file. The expected outcome is:

License Count= 32
Dependency Count= 295
Todo Count= 9
Unresolved Issue Count= 6
Unresolved Licensing Issue Count= 2
Unresolved Security Issue Count= 4
Unresolved Quality Issue Count= 0

Below is the script I have used for this.

sed -i 's/^[[:space:]]*//' fossa_results.txt   # trying to remove leading spaces
sed -i 's/[[:space:]]*$//' fossa_results.txt   # trying to remove trailing spaces
sed -i 's/"\""/""/g' fossa_results.txt   # trying to replace "
sed -i 's/"\"\\[.*?\\]: "/""/g' fossa_results.txt   # trying to remove any unwanted string that comes within [], like a date
sed -i 's/"\"\\[.*?\\]"/""/g' "fossa_results.txt"
sed -i 's/"\"license_count:"/"License Count="/g' "fossa_results.txt"
sed -i 's/"\"todo_count:"/"Todo Count="/g' "fossa_results.txt"
sed -i 's/"\"  dependency_count:"/"Dependancy Count="/g' "fossa_results.txt"
sed -i 's/"\"  unresolved_issue_count:"/"Unresolved Issue Count="/g' "fossa_results.txt"
sed -i 's/"\"  unresolved_licensing_issue_count:"/"Unresolved Licensing Issue Count="/g' "fossa_results.txt"
sed -i 's/"\"  unresolved_security_issue_count:"/"Unresolved Security Issue Count="/g' "fossa_results.txt"
sed -i 's/"\"  unresolved_quality_issue_count:"/"Unresolved Quality Issue Count="/g' "fossa_results.txt"
fossaresults="$(cat fossa_results.txt)"

But when I print fossa_results.txt with cat, it shows the original data, so the replacements do not seem to be working.

4
  • Assuming there are no CRs in your text file, something like: sed 's/^[[:blank:]]*//;s/[",]*//g;s/_/ /g;s/\w\+/\L\u&/g;s/:/=/' fossa_results.txt Commented Dec 23, 2022 at 11:57
  • Running sed -i on the same file repeatedly is an antipattern. You want to replace curl >file; sed -i xxx file; sed -i yyy file with simply curl | sed -e xxx -e yyy >file Commented Dec 23, 2022 at 11:58
  • If the input is actually JSON, use a JSON tool like jq to process it. Commented Dec 23, 2022 at 11:59
  • Questions that ask "please help me" tend to be looking for highly localized guidance, or in some cases, ongoing or private assistance, which is not suited to our Q&A format. It is also rather vague, and is better replaced with a more specific question. Please read Why is "Can someone help me?" not an actual question?. Commented Jan 2, 2023 at 10:19
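The single-pass approach suggested in the comments can be sketched as follows. The expressions in the question most likely never match because the double quotes inside the single-quoted sed scripts are taken literally, so a pattern such as "\"license_count:" looks for text that is not in the file. This is a minimal sketch, assuming GNU sed (for \+ and \u); fetch_results is a hypothetical stand-in for the real curl call, which the question does not show.

```shell
# Hypothetical stand-in for the curl call (the real URL is not shown in the question).
fetch_results() {
  printf '%s\n' \
    ' "license_count": 32,' \
    '    "dependency_count": 295,' \
    '    "unresolved_quality_issue_count": 0,'
}

# One sed process with several -e expressions, written straight to the target file.
fetch_results | sed \
  -e 's/^[[:space:]]*//' \
  -e 's/[[:space:]]*$//' \
  -e 's/[",]//g' \
  -e 's/_/ /g' \
  -e 's/:/=/' \
  -e 's/[a-z]\+/\u&/g' \
  > fossa_results.txt

cat fossa_results.txt
```

Because the file is written only once, there is no need for sed -i at all, and a failed fetch does not leave a half-edited file behind.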

3 Answers

1

An awk alternative:

 awk '{ gsub(":","="); gsub(/^ *|\"|,/,""); gsub("_"," "); for (i=1; i<=NF; ++i) { $i=toupper(substr($i,1,1)) tolower(substr($i,2)); }}1' src.dat
License Count= 32
Dependency Count= 295
Todo Count= 9
Unresolved Issue Count= 6
Unresolved Licensing Issue Count= 2
Unresolved Security Issue Count= 4
Unresolved Quality Issue Count= 0

gsub(":","=") replaces every colon with an equals sign.

gsub(/^ *|\"|,/,"") removes leading spaces, double quotes, and commas.

gsub("_"," ") replaces each underscore with a single space.

The loop for (i=1; i<=NF; ++i) { $i=toupper(substr($i,1,1)) tolower(substr($i,2)); } capitalizes the first letter of each field and lower-cases the rest.

The trailing 1 is an always-true pattern, so awk prints each modified line.
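The question asks for the result to be written back to the same file, which awk does not do by itself. One common option is a temporary file followed by mv. The sketch below assumes mktemp is available and recreates the sample src.dat from this answer so that it runs on its own.

```shell
# Recreate the sample input used in this answer.
cat > src.dat <<'EOF'
 "license_count": 32,
    "dependency_count": 295,
    "todo_count": 9,
    "unresolved_issue_count": 6,
    "unresolved_licensing_issue_count": 2,
    "unresolved_security_issue_count": 4,
    "unresolved_quality_issue_count": 0,
EOF

# Write the awk output to a temporary file, then replace the original
# only if awk succeeded.
tmp=$(mktemp) &&
awk '{
  gsub(":", "=")            # colon to equals sign
  gsub(/^ *|[",]/, "")      # drop leading spaces, quotes, commas
  gsub("_", " ")            # underscore to space
  for (i = 1; i <= NF; ++i)
    $i = toupper(substr($i, 1, 1)) tolower(substr($i, 2))
} 1' src.dat > "$tmp" &&
mv "$tmp" src.dat

cat src.dat
```

Redirecting awk straight to src.dat would truncate the file before awk reads it, which is why the intermediate file is used.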

Input file src.dat contents:

"license_count": 32,
    "dependency_count": 295,
    "todo_count": 9,
    "unresolved_issue_count": 6,
    "unresolved_licensing_issue_count": 2,
    "unresolved_security_issue_count": 4,
    "unresolved_quality_issue_count": 0,

1

This answer is off topic, because it proposes an awk solution instead of sed or bash as tagged (but it can still help).

You can use awk to format the content of the file correctly.

d.txt

 "license_count": 32,
    "dependency_count": 295,
    "todo_count": 9,
    "unresolved_issue_count": 6,
    "unresolved_licensing_issue_count": 2,
    "unresolved_security_issue_count": 4,
    "unresolved_quality_issue_count": 0,

a.awk

BEGIN {
    FS=":"
}
{
    gsub("\"","",$1)
    gsub(" ","",$1)
    gsub(",","",$2)
    print $1"="$2
}

Usage

awk -f a.awk d.txt 

Output

license_count= 32
dependency_count= 295
todo_count= 9
unresolved_issue_count= 6
unresolved_licensing_issue_count= 2
unresolved_security_issue_count= 4
unresolved_quality_issue_count= 0
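The output above keeps the snake_case keys. If the labels from the question are wanted, the same field-splitting idea could be extended roughly like this. It is a sketch rather than part of the original a.awk; the word array and the label and titled variables are names introduced here for illustration.

```shell
# Two sample lines stand in for d.txt.
titled=$(
  printf '%s\n' ' "license_count": 32,' '    "unresolved_security_issue_count": 4,' |
  awk -F':' '
  {
      gsub(/[" ]/, "", $1)        # strip quotes and spaces from the key
      gsub(/,/, "", $2)           # strip the trailing comma from the value
      n = split($1, word, "_")    # break the key into words
      label = ""
      for (i = 1; i <= n; i++) {
          # capitalize each word and join the words with single spaces
          label = label (i > 1 ? " " : "") toupper(substr(word[i], 1, 1)) substr(word[i], 2)
      }
      print label "=" $2
  }'
)

printf '%s\n' "$titled"
```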


1

Using GNU sed

$ sed -Ei.bak ':a;s/ +?([^:]*)_/\1 /;ta;s/:/=/;s/[",]//g;s/[a-z]+/\u&/g' input_file
License Count= 32
Dependency Count= 295
Todo Count= 9
Unresolved Issue Count= 6
Unresolved Licensing Issue Count= 2
Unresolved Security Issue Count= 4
Unresolved Quality Issue Count= 0
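The one-liner is easier to follow when split into steps. The following is a commented sketch of an equivalent script, assuming GNU sed (the \u case conversion is a GNU extension); the optional-space prefix is written as " *" here. Since -i.bak edits the file in place and keeps the original as input_file.bak, the result is shown with cat.

```shell
# Two sample lines stand in for the real input_file.
printf '%s\n' \
  ' "license_count": 32,' \
  '    "unresolved_licensing_issue_count": 2,' > input_file

sed -E -i.bak '
  # Loop: drop leading spaces and turn the last underscore
  # before the colon into a space, until none are left.
  :a
  s/ *([^:]*)_/\1 /
  ta
  # Colon to equals sign.
  s/:/=/
  # Remove double quotes and commas.
  s/[",]//g
  # Upper-case the first letter of each word.
  s/[a-z]+/\u&/g
' input_file

cat input_file
```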
