I'm doing an HTML-to-XML transformation with XSLT, and I'm finding it difficult to transform an HTML table with merged cells into XML.
Here is the scenario.
My input HTML table:
<table>
<thead>
<tr>
<td rowspan="3">Date</td>
<td colspan="5">Customer Price Index</td>
<td rowspan="3"> private consumption chain price </td>
<td colspan="2"> Other consumer price mesure </td>
</tr>
<tr>
<td rowspan="2"> All groups </td>
<td rowspan="2"> Excluding volatile items </td>
<td colspan="3">Market prices excluding volatile items</td>
<td colspan="2"> Based on seasonally adjusted quntity price changers </td>
</tr>
<tr>
<td>Goods</td>
<td>Services</td>
<td>Total</td>
<td> weihgted median </td>
<td>Trimmed mean</td>
</tr>
</thead>
<tbody>
<tr>
<td>2003/04</td>
<td colspan="8">content</td>
</tr>
<tr>
<td>Dec</td>
<td>2.4</td>
<td>2.4</td>
<td>1.6</td>
<td>2.2</td>
<td>1.8</td>
<td>1.0</td>
<td>2.0</td>
<td>2.5</td>
</tr>
</tbody>
</table>
Desired XML output:
<table>
<thead>
<row>
<data namest="1" morerows="2">
<p>Date</p>
</data>
<data namest="2" nameend="6">
<p>Consumer price index</p>
</data>
<data namest="7" morerows="2">
<p>Private consumption chain price index</p>
</data>
<data namest="8" nameend="9">
<p>Other consumer price mesure</p>
</data>
</row>
<row>
<data namest="2" morerows="1">
<p>All groups</p>
</data>
<data namest="3" morerows="1">
<p>Excluding volatile items</p>
</data>
<data namest="4" nameend="6">
<p>Market prices excluding volatile items</p>
</data>
<data namest="8" nameend="9">
<p>Based on seasonally adjusted quntity price changers</p>
</data>
</row>
<row>
<data namest="4">
<p>Goods</p>
</data>
<data namest="5">
<p>Services</p>
</data>
<data namest="6">
<p>Total</p>
</data>
<data namest="8">
<p>Weighted median</p>
</data>
<data namest="9">
<p>Trimmed mean</p>
</data>
</row>
</thead>
<tbody>
<row>
<data namest="1">
<p>2003/04</p>
</data>
<data namest="2" nameend="9">
<p>content</p>
</data>
</row>
<row>
<data namest="1">
<p>Dec</p>
</data>
<data namest="2">
<p>2.4</p>
</data>
<data namest="3">
<p>2.4</p>
</data>
<data namest="4">
<p>1.6</p>
</data>
<data namest="5">
<p>2.2</p>
</data>
<data namest="6">
<p>1.8</p>
</data>
<data namest="7">
<p>1.0</p>
</data>
<data namest="8">
<p>2.0</p>
</data>
<data namest="9">
<p>2.5</p>
</data>
</row>
</tbody>
</table>
As you can see, in the input HTML a vertical cell merge is represented by the rowspan attribute and a horizontal merge by the colspan attribute.
In the expected output, the namest attribute gives the cell's starting column number, the morerows attribute gives how many additional rows the cell spans downward (vertical merge), and the nameend attribute gives the cell's last column number (horizontal merge).
In other languages this could be solved with a data structure such as a two-dimensional array, but I'm struggling to find an effective way to do it in XSLT.
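For reference, this is roughly what I mean by the two-dimensional-array approach, as a minimal Python sketch rather than a finished implementation. The function names `cell_attributes` and `parse_rows` are made up for illustration, and the sketch assumes that each thead/tbody is processed on its own and that every rowspan/colspan value is a valid positive integer.

```python
import xml.etree.ElementTree as ET


def cell_attributes(rows):
    """Compute namest/nameend/morerows for every cell of one table section.

    rows: list of rows, each a list of (text, rowspan, colspan) tuples.
    Returns one list per row of (text, attrs) pairs.
    """
    occupied = set()  # (row, col) slots already claimed by some cell
    result = []
    for r, row in enumerate(rows):
        out = []
        col = 1  # column numbers are 1-based, as in the desired output
        for text, rowspan, colspan in row:
            # Skip columns taken by cells merged down from earlier rows.
            while (r, col) in occupied:
                col += 1
            attrs = {"namest": col}
            if colspan > 1:
                attrs["nameend"] = col + colspan - 1
            if rowspan > 1:
                attrs["morerows"] = rowspan - 1
            # Mark every grid slot this cell covers.
            for dr in range(rowspan):
                for dc in range(colspan):
                    occupied.add((r + dr, col + dc))
            out.append((text, attrs))
            col += colspan
        result.append(out)
    return result


def parse_rows(section):
    """Read the tr/td children of a parsed thead or tbody element."""
    return [
        [((td.text or "").strip(),
          int(td.get("rowspan", 1)),
          int(td.get("colspan", 1)))
         for td in tr.findall("td")]
        for tr in section.findall("tr")
    ]
```

Calling `cell_attributes(parse_rows(ET.fromstring(thead_markup)))`, where `thead_markup` is a hypothetical string holding the thead element, yields the attribute values per row. The key step is skipping slots already occupied by a cell from an earlier row, which is exactly the part I cannot express cleanly in XSLT.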
I wrote the following XSLT for this task. It works for the first row, but extending the same approach to the other rows gets too complicated.
<xsl:template match="td[parent::tr[not(preceding::tr)]]">
<xsl:variable name="pre_rowspan" select="number(format-number(count(preceding-sibling::td[@rowspan])+1, '#0', 'myformat'))"/>
<xsl:variable name="pre_colspan" select="number(format-number(preceding-sibling::td[@colspan]/@colspan, '#0', 'myformat'))"/>
<xsl:variable name="numberof_pre_rowspan" select="number(format-number(count(preceding-sibling::td[@rowspan])+1, '#0', 'myformat'))"/>
<data>
<xsl:attribute name="namest" select="number($pre_rowspan + $pre_colspan)"/>
<xsl:if test="@rowspan">
<xsl:attribute name="morerows" select="number(@rowspan)-1"/>
</xsl:if>
<xsl:if test="@colspan">
<xsl:attribute name="nameend" select="number(@colspan)+number(format-number(count(preceding-sibling::td[@rowspan]), '#0', 'myformat'))+number(format-number(number(preceding-sibling::td[@colspan]/@colspan), '#0', 'myformat'))"/>
</xsl:if>
<xsl:if test="@rowspan and @colspan">
<xsl:attribute name="nameend" select="$pre_rowspan"/>
</xsl:if>
<p>
<xsl:apply-templates/>
</p>
</data>
</xsl:template>
So, can anyone suggest how I can do this in XSLT, whether with a data structure or some other method?