I have this HTML table:

<?xml version="1.0" encoding="UTF-8"?>
<table class="names">
  <tbody>
    <tr class="names">
      <td>
        <p><strong class="strong">name</strong></p>
      </td>
      <td>
        <p><strong class="strong">surname</strong></p>
      </td>
      <td>
        <p><strong class="strong">aff</strong></p>
      </td>
    </tr>
    <tr class="names">
      <td>
        <p><span class="contrib">John</span></p>
      </td>
      <td>
        <p><span class="contrib">Smith</span></p>
      </td>
      <td>
        <p><span class="contrib">1,3</span></p>
      </td>
    </tr>
    <tr class="names">
      <td>
        <p><span class="contrib">Michael</span></p>
      </td>
      <td>
        <p><span class="contrib">Jordan</span></p>
      </td>
      <td>
        <p><span class="contrib">1,2</span></p>
      </td>
    </tr>
  </tbody>
</table>

I would like to transform it to structured XML elements like this:

<contrib>
  <person>
    <name>John</name>
    <surname>Smith</surname>
    <number>1</number>
    <number>3</number>   
  </person>
  <person>
    <name>Michael</name>
    <surname>Jordan</surname>
    <number>1</number>
    <number>2</number>
  </person>
</contrib>

And I created this XSLT so far:

  <xsl:template name="article-meta">
    <contrib>
      <person>
        <name>
          <xsl:value-of select=".//td[1]//span[@class='contrib']"/>
        </name>
        <surname>
          <xsl:value-of select=".//td[2]//span[@class='contrib']"/>
        </surname>
        <xsl:for-each select="//td[3]//span[@class='contrib']">
          <number><xsl:value-of select="normalize-space(.)"/></number>
        </xsl:for-each>
      </person>
    </contrib>
  </xsl:template>

I've been experimenting all day, but I can't produce multiple <person> blocks; all the results always end up inside a single element. Is it possible to create the structure above when all <span> elements inside the <td> cells carry only the class "contrib" and nothing else? Also, I believe the last cell needs to be tokenized, but I don't know how to do that either.

2 Comments
  • Please make it a habit to tag your questions with the highest XSLT version you can use, in addition to the xslt tag. Commented Sep 27, 2020 at 17:02
  • I will keep that in mind for the future. If it helps, I can use 2.0 or 3.0 for this case. Commented Sep 27, 2020 at 17:23

1 Answer

Is there a reason why you cannot simply do:

XSLT 2.0

<xsl:stylesheet version="2.0" 
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:strip-space elements="*"/>

<xsl:template match="/table">
    <contrib>
        <xsl:for-each select="tbody/tr[position() > 1]">
            <person>
                <name>
                    <xsl:value-of select="td[1]"/>
                </name>
                <surname>
                    <xsl:value-of select="td[2]"/>
                </surname>
                <xsl:for-each select="tokenize(td[3], ',')">
                    <number>
                        <xsl:value-of select="."/>
                    </number>
                </xsl:for-each>
            </person>
        </xsl:for-each>
    </contrib>    
</xsl:template>

</xsl:stylesheet>

3 Comments

  • This seems doable, but there is a reason why I may not be able to use it in this particular case. The same HTML document contains multiple tables with the class "names", but only this one has spans with the class "contrib" inside. Is there an alternative that handles this as well?
  • You could handle this easily by selecting/matching table[.//span/@class='contrib']. I won't adjust my answer because the code in your question does not represent this situation.
  • Ok, this is simply amazing. It works exactly as expected. I didn't know there's an option to address (sub)classes like this as well. Thank you so much!
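
For readers with the same multi-table situation, the suggestion from the comments could be sketched roughly as follows. This is only an illustration, not the answerer's own code. It assumes the input is well-formed, un-namespaced XML, that exactly one table in the document contains a span with the class "contrib" (more than one would produce several root elements), and that each class attribute holds the single value "contrib". The root template and the row filter are additions made for this sketch.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- Assumption: the table may sit anywhere in the document, so start
       from the root and visit only tables that contain a "contrib" span. -->
  <xsl:template match="/">
    <xsl:apply-templates select="//table[.//span/@class='contrib']"/>
  </xsl:template>

  <xsl:template match="table">
    <contrib>
      <!-- Keep only rows that hold "contrib" spans; this skips the header
           row without relying on its position. -->
      <xsl:for-each select="tbody/tr[.//span/@class='contrib']">
        <person>
          <name><xsl:value-of select="td[1]"/></name>
          <surname><xsl:value-of select="td[2]"/></surname>
          <!-- Split "1,3" on commas; normalize-space() also tolerates
               input such as "1, 3". -->
          <xsl:for-each select="tokenize(td[3], ',')">
            <number><xsl:value-of select="normalize-space(.)"/></number>
          </xsl:for-each>
        </person>
      </xsl:for-each>
    </contrib>
  </xsl:template>

</xsl:stylesheet>
```

Because every path inside the outer xsl:for-each is relative to the current row, each row yields its own <person> element. The template in the question used the absolute path //td[3], which selects the third cells of all rows in the document at once, and it never iterated over the rows, which is why everything landed in a single block.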
