
I have two files: a) xmlFile.xml and b) emails.txt.

xmlFile.xml has the following structure, repeated multiple times:

<gname>Office</gname>
<uname>person</uname>

emails.txt has a list of email addresses:

[email protected]
[email protected]
...

What I want to accomplish is to replace each "person" in xmlFile.xml with the next value taken from emails.txt, in order.

I have tried:

# while read email ; do sed  "s/person/$email/g" xmlFile.xml > xmlFile.new; done < emails.txt

However, I end up with a file in which every "person" value is replaced with the last email from emails.txt.

Thanks, Filip

    You need to have sed only replace the first occurrence, then repeat once for each line. This will be slow, of course. The sane answer is, I believe, use Perl. :-P Alternatively, you could transform your emails.txt into a giant sed script and run it once. Commented Feb 25, 2011 at 19:35

3 Answers

awk 'NR==FNR{e[i++]=$0;next} /person/{sub("person",e[j++])}1' emails.txt xmlFile.xml

Explanation

  1. NR==FNR: This is only true while awk is reading the first file. It compares the total number of records read so far (NR) with the record number within the current file (FNR).
  2. e[i++]=$0: Store the current record ($0) in an array named e, whose index increments by 1 (i++) with each record. This array will hold our emails.
  3. next: Skip the rest of the script and start over with the next input record.
  4. /person/: Only run the following action if the current record matches the regex "person".
  5. sub("person",e[j++]): Replace the first match of "person" in the record with the next value from the array e built earlier, then increment the index j for the next matching record.
  6. 1: Always evaluates to true, so it is a shortcut for {print $0}, which outputs the current record.
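
For awk novices, the same one-liner can be spread over several lines with one comment per rule. This is only a sketch: the two addresses and the scratch directory are hypothetical stand-ins, not the asker's real data.

```shell
# Hypothetical sample data in a scratch directory (not the real files).
tmp=$(mktemp -d)
printf '%s\n' 'alice@example.com' 'bob@example.com' > "$tmp/emails.txt"
printf '%s\n' '<gname>Office</gname>' '<uname>person</uname>' \
              '<gname>Office</gname>' '<uname>person</uname>' > "$tmp/xmlFile.xml"

result=$(awk '
  # First file (emails.txt): NR equals FNR only here, so store the line and move on.
  NR == FNR { e[i++] = $0; next }
  # Second file: swap the first "person" on a matching line for the next stored email.
  /person/  { sub("person", e[j++]) }
  # Print every line of the second file, changed or not.
  { print }
' "$tmp/emails.txt" "$tmp/xmlFile.xml")

printf '%s\n' "$result"
rm -rf "$tmp"
```

Two caveats worth knowing: if emails.txt has fewer lines than there are "person" lines, the extra ones are replaced with an empty string, and if emails.txt is empty, NR==FNR stays true for the XML file as well, so nothing is printed.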

Proof Of Concept

$ cat emails.txt
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]

$ cat xmlFile.xml
<gname>Office</gname>
<uname>person</uname>
<gname>Office</gname>
<uname>person</uname>
<gname>Office</gname>
<uname>person</uname>
<gname>Office</gname>
<uname>person</uname>
<gname>Office</gname>
<uname>person</uname>
<gname>Office</gname>
<uname>person</uname>
<gname>Office</gname>
<uname>person</uname>
<gname>Office</gname>
<uname>person</uname>
<gname>Office</gname>
<uname>person</uname>

$ awk 'NR==FNR{e[i++]=$0;next} /person/{sub("person",e[j++])}1' emails.txt xmlFile.xml
<gname>Office</gname>
<uname>[email protected]</uname>
<gname>Office</gname>
<uname>[email protected]</uname>
<gname>Office</gname>
<uname>[email protected]</uname>
<gname>Office</gname>
<uname>[email protected]</uname>
<gname>Office</gname>
<uname>[email protected]</uname>
<gname>Office</gname>
<uname>[email protected]</uname>
<gname>Office</gname>
<uname>[email protected]</uname>
<gname>Office</gname>
<uname>[email protected]</uname>
<gname>Office</gname>
<uname>[email protected]</uname>

The above script assumes that person is a literal value. If it is not, then:

Replace: /person/{sub("person",e[j++])}
With: /<uname>/{sub(".*","<uname>"e[j++]"</uname>")}
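
Put together as a full command, the tag-based variant might look like the sketch below. It uses the array name e from the one-liner above. The regex is narrowed to the <uname> element instead of ".*" so that any indentation survives; that narrowing is an assumption added here, not part of the answer, and the user names and addresses are hypothetical.

```shell
# Hypothetical sample data: the <uname> values differ, so matching on "person" would not work.
tmp=$(mktemp -d)
printf '%s\n' 'alice@example.com' 'bob@example.com' > "$tmp/emails.txt"
printf '%s\n' '<gname>Office</gname>' '  <uname>jsmith</uname>' \
              '<gname>Office</gname>' '  <uname>mjones</uname>' > "$tmp/xmlFile.xml"

result=$(awk '
  # Collect the emails from the first file.
  NR == FNR { e[i++] = $0; next }
  # On lines with a <uname> element, rewrite just that element with the next email.
  /<uname>/ { sub(/<uname>[^<]*<\/uname>/, "<uname>" e[j++] "</uname>") }
  { print }
' "$tmp/emails.txt" "$tmp/xmlFile.xml")

printf '%s\n' "$result"
rm -rf "$tmp"
```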


1 Comment

The solution seems perfect, but to be a good answer some explanation for awk novices should be given.

One way to accomplish this would be to use in-place editing:

while read email ; do sed -i "s/person/$email/;q" xmlFile.xml; done < emails.txt
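
Note that the q in the loop above has no address, so sed quits after the first input line on every pass. A possible alternative, assuming GNU sed, is to restrict the substitution to the range that ends at the first remaining "person". The sketch below works on a copy of the file, uses hypothetical addresses, and assumes no email contains "/", "&" or the word "person".

```shell
# Hypothetical sample data in a scratch directory.
tmp=$(mktemp -d)
printf '%s\n' 'alice@example.com' 'bob@example.com' > "$tmp/emails.txt"
printf '%s\n' '<gname>Office</gname>' '<uname>person</uname>' \
              '<gname>Office</gname>' '<uname>person</uname>' > "$tmp/xmlFile.xml"

# Edit a copy so the original stays untouched.
cp "$tmp/xmlFile.xml" "$tmp/xmlFile.new"

while IFS= read -r email; do
  # 0,/person/ is a GNU sed address range ending at the first line that matches;
  # the empty regex in s// reuses /person/, so only that first match is replaced.
  sed -i "0,/person/s//$email/" "$tmp/xmlFile.new"
done < "$tmp/emails.txt"

result=$(cat "$tmp/xmlFile.new")
printf '%s\n' "$result"
rm -rf "$tmp"
```

This still rewrites the whole file once per email, so it is slow for long lists.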

If there's little or nothing more to the XML file than what you've shown, just reconstruct it:

sed -e 'i <gname>Office</gname>' -e 's|.*|<uname>&</uname>|' emails.txt > newxmlFile.xml

without even touching the existing xmlFile.xml.
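
The one-line form of the i command used above is a GNU sed convenience. Where that is not available, the same reconstruction could be sketched with awk. The group name "Office" is hard-coded as in the question, and the addresses are hypothetical.

```shell
# Hypothetical email list in a scratch directory.
tmp=$(mktemp -d)
printf '%s\n' 'alice@example.com' 'bob@example.com' > "$tmp/emails.txt"

# For each email, emit the fixed <gname> line followed by the wrapped address.
awk '{ print "<gname>Office</gname>"; print "<uname>" $0 "</uname>" }' \
    "$tmp/emails.txt" > "$tmp/newxmlFile.xml"

result=$(cat "$tmp/newxmlFile.xml")
printf '%s\n' "$result"
rm -rf "$tmp"
```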

However, you should probably use an XML parser such as xmlstarlet.

1 Comment

Thanks Dennis. The first example resulted in each "person" being changed to the first email in emails.txt; however, the second example did work. Thanks!

Here's how to do it using bash & xmlstarlet!

IFS=$'\n' read -r -d "" -a array < emails.txt                   # read file with email addresses into array
n=$(xmlstarlet sel -T -t -v "count(//uname)" -n xmlFile.xml)    # count "uname" nodes in XML file
xmlFileStr="$(< xmlFile.xml)"                                   # read XML file into variable


if [[ $n -eq ${#array[@]} ]]; then   # if the number of nodes & email addresses is equal ...
   for ((i=1; i <= ${n}; i+=1)); do
      xmlFileStr="$(printf '%s' "$xmlFileStr" | xmlstarlet ed -P -t -u "//uname[${i}]" -v "${array[$((i-1))]}")"
   done
fi

printf '%s\n' "$xmlFileStr" > xmlFile.xml
cat xmlFile.xml
