I have a large XML file that contains image annotation details. A sample is as follows:
<?xml version="1.0" encoding="UTF-8"?>
<dataset>
<name>dataset containing bounding box labels on images</name>
<comment>created by BBTag</comment>
<tags>
<tag name="ScoreBoard-Vivon" color="#bf5786"/>
<tag name="Perimeter-Vivon" color="#032585"/>
</tags>
<images>
<image file="/var/www/html/beacon.com/resources/videos/ST2_20170812/ST_2_20170812-0005.jpg">
<box top="253" left="166" width="56" height="24">
<label>Perimeter-Vivon</label>
</box>
<box top="255" left="229" width="61" height="21">
<label>Perimeter-Vivon</label>
</box>
<box top="254" left="290" width="58" height="23">
<label>Perimeter-Vivon</label>
</box>
<box top="254" left="361" width="56" height="20">
<label>Perimeter-Vivon</label>
</box>
<box top="254" left="417" width="63" height="22">
<label>Perimeter-Vivon</label>
</box>
<box top="254" left="486" width="63" height="20">
<label>Perimeter-Vivon</label>
</box>
<box top="504" left="329" width="51" height="29">
<label>ScoreBoard-Vivon</label>
</box>
</image>
</images>
</dataset>
I want to split this file by tag name. This sample has two tags, ScoreBoard-Vivon and Perimeter-Vivon, and I want to create a separate XML file for each tag, containing only that tag's declaration and its boxes. The desired output would be as follows:
For ScoreBoard-Vivon.xml:
<?xml version="1.0" encoding="UTF-8"?>
<dataset>
<name>dataset containing bounding box labels on images</name>
<comment>created by BBTag</comment>
<tags>
<tag name="ScoreBoard-Vivon" color="#bf5786"/>
</tags>
<images>
<image file="/var/www/html/beacon.com/resources/videos/ST2_20170812/ST_2_20170812-0005.jpg">
<box top="504" left="329" width="51" height="29">
<label>ScoreBoard-Vivon</label>
</box>
</image>
</images>
</dataset>
For Perimeter-Vivon.xml:
<?xml version="1.0" encoding="UTF-8"?>
<dataset>
<name>dataset containing bounding box labels on images</name>
<comment>created by BBTag</comment>
<tags>
<tag name="Perimeter-Vivon" color="#032585"/>
</tags>
<images>
<image file="/var/www/html/beacon.com/resources/videos/ST2_20170812/ST_2_20170812-0005.jpg">
<box top="253" left="166" width="56" height="24">
<label>Perimeter-Vivon</label>
</box>
<box top="255" left="229" width="61" height="21">
<label>Perimeter-Vivon</label>
</box>
<box top="254" left="290" width="58" height="23">
<label>Perimeter-Vivon</label>
</box>
<box top="254" left="361" width="56" height="20">
<label>Perimeter-Vivon</label>
</box>
<box top="254" left="417" width="63" height="22">
<label>Perimeter-Vivon</label>
</box>
<box top="254" left="486" width="63" height="20">
<label>Perimeter-Vivon</label>
</box>
</image>
</images>
</dataset>
I have 350-400 such tags. How can I split the file into one file per tag?
New Example:
<?xml version="1.0" encoding="UTF-8"?>
<dataset>
<name>dataset containing bounding box labels on images</name>
<comment>created by BBTag</comment>
<tags>
<tag name="Perimeter-SVT" color="#f9e99c"/>
<tag name="Perimeter-Vivon" color="#032585"/>
<tag name="ScoreBoard-Vivon" color="#bf5786"/>
<tag name="Perimeter-StarSports" color="#12dadd"/>
</tags>
<images>
<image file="/var/www/html/tamsports.com/resources/videos/STAR_SPORTS_2_20170812/STAR_SPORTS_2_20170812-0011.jpg">
<box top="505" left="327" width="56" height="29">
<label>ScoreBoard-Vivon</label>
</box>
<box top="218" left="387" width="67" height="24">
<label>Perimeter-SVT</label>
</box>
</image>
<image file="/var/www/html/tamsports.com/resources/videos/STAR_SPORTS_2_20170812/STAR_SPORTS_2_20170812-0005.jpg">
<box top="254" left="159" width="64" height="23">
<label>Perimeter-Vivon</label>
</box>
<box top="255" left="225" width="61" height="20">
<label>Perimeter-Vivon</label>
</box>
<box top="254" left="285" width="63" height="23">
<label>Perimeter-Vivon</label>
</box>
<box top="253" left="357" width="58" height="24">
<label>Perimeter-Vivon</label>
</box>
<box top="254" left="424" width="56" height="25">
<label>Perimeter-Vivon</label>
</box>
<box top="256" left="484" width="65" height="23">
<label>Perimeter-Vivon</label>
</box>
<box top="507" left="326" width="58" height="26">
<label>ScoreBoard-Vivon</label>
</box>
</image>
<image file="/var/www/html/tamsports.com/resources/videos/STAR_SPORTS_2_20170812/STAR_SPORTS_2_20170812-0009.jpg">
<box top="249" left="400" width="59" height="29">
<label>Perimeter-StarSports</label>
</box>
</image>
</images>
</dataset>
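For reference, the split described above could be sketched in Python roughly as follows. This is only a sketch under some assumptions, not a tested solution for the real data: it uses the standard-library `xml.etree.ElementTree`, assumes every `<box>` has exactly one `<label>`, assumes an image with no box for a given tag should be left out of that tag's file, and assumes tag names are safe to use as file names. The function names `split_by_tag` and `write_parts` are made up for illustration.

```python
import copy
import os
import xml.etree.ElementTree as ET


def split_by_tag(root):
    """Return {tag name: new <dataset> element holding only that tag's boxes}."""
    parts = {}
    for tag in root.findall("./tags/tag"):
        part = ET.Element(root.tag, root.attrib)
        # Copy the shared header elements (<name>, <comment>, ...) as they are.
        for child in root:
            if child.tag not in ("tags", "images"):
                part.append(copy.deepcopy(child))
        # Keep only this tag's declaration and start with an empty <images>.
        ET.SubElement(part, "tags").append(copy.deepcopy(tag))
        ET.SubElement(part, "images")
        parts[tag.get("name")] = part

    # Single pass over the boxes, so the cost does not grow with the tag count.
    for image in root.iter("image"):
        for box in image.findall("box"):
            label = box.findtext("label")
            if label not in parts:
                continue  # assumption: labels without a declared tag are skipped
            images = parts[label].find("images")
            # Reuse the <image> just created for this file, otherwise add one.
            if len(images) and images[-1].get("file") == image.get("file"):
                target = images[-1]
            else:
                target = ET.SubElement(images, "image", image.attrib)
            target.append(box)
    return parts


def write_parts(parts, out_dir):
    """Write each dataset to <out_dir>/<tag name>.xml."""
    os.makedirs(out_dir, exist_ok=True)
    for name, part in parts.items():
        ET.indent(part)  # pretty-print; requires Python 3.9+
        ET.ElementTree(part).write(
            os.path.join(out_dir, name + ".xml"),
            encoding="UTF-8",
            xml_declaration=True,
        )
```

With the input saved under a hypothetical name such as `annotations.xml`, calling `write_parts(split_by_tag(ET.parse("annotations.xml").getroot()), "out")` would produce one file per declared tag in an `out` directory. A tag that is declared but never used would still get a file, with an empty `<images>` element.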