
My issue should be rather simple, and I'm guessing where I'm failing is syntax related. I'm looking to write a simple shell script which manipulates an XML file in the following way.

The data points inside each tag will be numeric values. These should all be positive, so if there exists a negative value inside any tag within the XML file, I would like to replace it with a zero. For example, the following XML file - call it "file.xml"

<tag1>19</tag1>
<tag2>2</tag2>
<tag3>-12</tag3>
<tag4>37</tag4>
<tag5>-41</tag5>

should be replaced with

<tag1>19</tag1>
<tag2>2</tag2>
<tag3>0</tag3>
<tag4>37</tag4>
<tag5>0</tag5>

My thinking was to grep for any instance of the pattern ">-*<" in the file and, if it is found, use sed to replace it with ">0<", as follows.

#!/bin/bash
STRING=">-*<"
if grep -xq "$STRING" file.xml
then
sed -i 's/$STRING/>0</g' file.xml
else
echo "that string was not found in the file"
fi

However, all I get back is the echo string "that string was not found in the file", even though I have included negative values in the file. Does the * not take into account any string following the minus sign in this example? Naturally there can be any number following the minus sign, so I'm thinking my problem is how I've defined the variable: STRING=">-*<". Any pointers in the right direction would be greatly appreciated. Thanks in advance.

3 Answers


Try this:

STRING=">-.*<"

or better yet:

STRING=">-[0-9]*<"

In general, * means 'zero or more of the preceding character or character class', so .* matches any string (including the empty one) and [0-9]* matches any run of digits. Your expression, >-*<, would have matched '><', '>-<', '>--<', '>---<' and so on, but never '>-12<'.
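To see the difference concretely, here is a small sketch that feeds three lines to grep and keeps whichever lines each pattern matches. The sample lines and the variable names (sample, loose, strict) are made up for illustration and are not from the question's file.

```shell
#!/bin/bash
# Hypothetical sample input: a negative value, an empty tag, a positive value.
sample=$(printf '%s\n' '<a>-12</a>' '<b></b>' '<c>7</c>')

# '>-*<' is '>' followed by zero or more '-' and then '<',
# so only the empty tag qualifies.
loose=$(printf '%s\n' "$sample" | grep -e '>-*<')

# '>-[0-9]*<' is '>', '-', zero or more digits, '<',
# so only the negative value qualifies.
strict=$(printf '%s\n' "$sample" | grep -e '>-[0-9]*<')

echo "matched by >-*<      : $loose"
echo "matched by >-[0-9]*< : $strict"
```

With this input the first pattern should pick out only the empty tag and the second only the negative value, which is why the original script never found anything to replace.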


1 Comment

Thanks for the response @jomuel. You were spot on. The "-xq" option was also messing me up. I replaced this with just "-q" and it worked fine. I'll accept your answer now, and thanks again for the help.
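Putting the answer and the comment together, a corrected version of the script might look like the sketch below. It works on a scratch copy of the sample data (the mktemp file and the names file and result are assumptions for the demo, not part of the question). One more thing seems worth pointing out: as posted, the sed line wraps $STRING in single quotes, so the shell would not expand the variable; the sketch uses double quotes there.

```shell
#!/bin/bash
# Scratch copy of the question's sample data, so the sketch is self-contained.
file=$(mktemp)
printf '%s\n' '<tag1>19</tag1>' '<tag2>2</tag2>' '<tag3>-12</tag3>' \
              '<tag4>37</tag4>' '<tag5>-41</tag5>' > "$file"

# '-' followed by at least one digit, sitting between '>' and '<'.
STRING='>-[0-9][0-9]*<'

# -q only: with -x the pattern would have to match the entire line.
if grep -q -e "$STRING" "$file"
then
    # Double quotes let the shell expand $STRING before sed sees it.
    # -i without a backup suffix assumes GNU sed.
    sed -i "s/$STRING/>0</g" "$file"
else
    echo "that string was not found in the file"
fi

result=$(cat "$file")
echo "$result"
rm -f "$file"
```

On the sample data this should leave tag3 and tag5 holding 0 and the other lines unchanged.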
2
cat aaa.txt
<tag1>19</tag1>
<tag2>2</tag2>
<tag3>-12</tag3>
<tag4>37</tag4>
<tag5>-41</tag5>

Replace negatives with zero and write to a new file:

sed 's/-[0-9][0-9]*/0/' aaa.txt > a2.txt

Check the new file:

cat a2.txt
<tag1>19</tag1>
<tag2>2</tag2>
<tag3>0</tag3>
<tag4>37</tag4>
<tag5>0</tag5>

1 Comment

Thanks for the response @bobo. Yep that definitely works alright. I have the code working fine now. Thanks again for taking the time to respond.
1
>-[0-9]+<

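is a better choice, as + requires one or more digits, so a bare >-< no longer matches. Note that + only has this meaning in extended regex (grep -E, sed -E); in basic regex it is a literal plus sign, and GNU tools accept \+ instead.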
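A quick way to check how + is interpreted is to count matches with and without -E. This is only a sketch; the variable names (line, basic, extended) are made up for the demo.

```shell
#!/bin/bash
line='<tag3>-12</tag3>'

# Basic regex (grep's default): '+' is an ordinary character here,
# so the pattern looks for a literal plus sign and counts no lines.
basic=$(printf '%s\n' "$line" | grep -c -e '>-[0-9]+<')

# Extended regex (-E): '+' means "one or more of the preceding item".
extended=$(printf '%s\n' "$line" | grep -c -E -e '>-[0-9]+<')

echo "basic: $basic, extended: $extended"
```

The same applies to sed: the pattern needs sed -E, or the [0-9][0-9]* spelling, to behave as "one or more digits".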

3 Comments

Is there any instance where >-.*< might fail?
Yes, if your file contains any other XML whose text starts with a dash.
Of course. In this particular instance there will only ever be numeric values in the xml file so I should be safe, but point duly noted. Thanks
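The failure case from the comments can be reproduced with a single made-up line. This is a sketch only: the note element is hypothetical and does not come from the question's file.

```shell
#!/bin/bash
# Hypothetical line: a text node that merely starts with a dash.
line='<note>-see above</note>'

# '.*' accepts any text after the dash, so the note's text is overwritten.
loose=$(printf '%s\n' "$line" | sed 's/>-.*</>0</')

# Requiring one or more digits leaves non-numeric text untouched.
strict=$(printf '%s\n' "$line" | sed -E 's/>-[0-9]+</>0</')

echo "with >-.*<     : $loose"
echo "with >-[0-9]+< : $strict"
```

Even when the file is expected to hold only numbers, the digit-only pattern costs nothing extra and avoids this kind of accidental rewrite.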
