I have an excel as follows which has header data in 6th row.

(screenshot)

EDIT:

The input Excel file may also appear as follows. The data may appear in any column. The data has to be identified using the row headers Ad Name, UID and Status. These headers will not change.

(two screenshots)

It is then saved as a workbook XML file, as follows:

<?xml version="1.0"?>
<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"
 xmlns:o="urn:schemas-microsoft-com:office:office"
 xmlns:x="urn:schemas-microsoft-com:office:excel"
 xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet"
 xmlns:html="http://www.w3.org/TR/REC-html40">
 <DocumentProperties xmlns="urn:schemas-microsoft-com:office:office">
  <Author>Jefferson D</Author>
  <LastAuthor>Jefferson D</LastAuthor>
  <Created>2015-10-29T17:10:31Z</Created>
  <LastSaved>2015-10-29T17:15:02Z</LastSaved>
  <Company>*CL</Company>
  <Version>12.0</Version>
 </DocumentProperties>
 <OfficeDocumentSettings xmlns="urn:schemas-microsoft-com:office:office">
  <AllowPNG/>
 </OfficeDocumentSettings>
 <ExcelWorkbook xmlns="urn:schemas-microsoft-com:office:excel">
  <WindowHeight>22060</WindowHeight>
  <WindowWidth>34400</WindowWidth>
  <WindowTopX>-20</WindowTopX>
  <WindowTopY>-20</WindowTopY>
  <Date1904/>
  <ProtectStructure>False</ProtectStructure>
  <ProtectWindows>False</ProtectWindows>
 </ExcelWorkbook>
 <Styles>
  <Style ss:ID="Default" ss:Name="Normal">
   <Alignment ss:Vertical="Bottom"/>
   <Borders/>
   <Font ss:FontName="Verdana"/>
   <Interior/>
   <NumberFormat/>
   <Protection/>
  </Style>
  <Style ss:ID="s16">
   <Font ss:FontName="Verdana" ss:Bold="1"/>
  </Style>
 </Styles>
 <Worksheet ss:Name="Sheet1">
  <Table ss:ExpandedColumnCount="3" ss:ExpandedRowCount="10" x:FullColumns="1"
   x:FullRows="1">
   <Column ss:AutoFitWidth="0" ss:Width="176.0"/>
   <Column ss:AutoFitWidth="0" ss:Width="141.0"/>
   <Column ss:AutoFitWidth="0" ss:Width="152.0"/>
   <Row>
    <Cell ss:Index="2" ss:StyleID="s16"><Data ss:Type="String">Ad Report</Data></Cell>
   </Row>
   <Row ss:Index="3">
    <Cell><Data ss:Type="String">IssueNo: 1</Data></Cell>
   </Row>
   <Row>
    <Cell><Data ss:Type="String">IssueName: XXX</Data></Cell>
   </Row>
   <Row>
    <Cell><Data ss:Type="String">Issue Date: YYY</Data></Cell>
   </Row>
   <Row ss:StyleID="s16">
    <Cell><Data ss:Type="String">Ad Name</Data></Cell>
    <Cell><Data ss:Type="String">UID</Data></Cell>
    <Cell><Data ss:Type="String">Status</Data></Cell>
   </Row>
   <Row>
    <Cell><Data ss:Type="String">WWW</Data></Cell>
    <Cell><Data ss:Type="String">0A1</Data></Cell>
    <Cell><Data ss:Type="String">active</Data></Cell>
   </Row>
   <Row>
    <Cell><Data ss:Type="String">XXX</Data></Cell>
    <Cell><Data ss:Type="String">1B2</Data></Cell>
    <Cell><Data ss:Type="String">active</Data></Cell>
   </Row>
   <Row>
    <Cell><Data ss:Type="String">YYY</Data></Cell>
    <Cell><Data ss:Type="String">2C3</Data></Cell>
    <Cell><Data ss:Type="String">inactive</Data></Cell>
   </Row>
  </Table>
  <WorksheetOptions xmlns="urn:schemas-microsoft-com:office:excel">
   <Print>
    <ValidPrinterInfo/>
    <PaperSizeIndex>10</PaperSizeIndex>
    <HorizontalResolution>-4</HorizontalResolution>
    <VerticalResolution>-4</VerticalResolution>
   </Print>
   <ShowPageLayoutZoom/>
   <PageLayoutZoom>100</PageLayoutZoom>
   <Selected/>
   <Panes>
    <Pane>
     <Number>3</Number>
     <ActiveRow>13</ActiveRow>
     <ActiveCol>2</ActiveCol>
    </Pane>
   </Panes>
   <ProtectObjects>False</ProtectObjects>
   <ProtectScenarios>False</ProtectScenarios>
  </WorksheetOptions>
 </Worksheet>
</Workbook>

I would like to extract some data from the Excel XML file using XSLT 2.0 and create a new XML document as follows:

<adverts>
   <advert>
      <advertName>WWW</advertName>
      <advertNumber>0A1</advertNumber>
      <advertStatus>active</advertStatus>
   </advert>
   <advert>
      <advertName>XXX</advertName>
      <advertNumber>1B2</advertNumber>
      <advertStatus>active</advertStatus>
   </advert>
   <advert>
      <advertName>YYY</advertName>
      <advertNumber>2C3</advertNumber>
      <advertStatus>inactive</advertStatus>
   </advert>
</adverts>

I am quite confused because this is the first time I am dealing with workbook XML. Any pointer to a guide would also be appreciated.

  • Do you know the names of the result elements, like advertName or advertNumber, or do you need to construct them from the data in the Excel sheet? Commented Oct 29, 2015 at 19:13
  • I already know the tag names; they need not be extracted from the Excel sheet. Only the text values have to be extracted. Commented Oct 29, 2015 at 19:17

1 Answer

Edited in response to clarifications:

The data may appear in any column. The data has to be identified using the row headers Ad Name, UID and Status.

Try it this way:

XSLT 2.0

<xsl:stylesheet version="2.0" 
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet"
exclude-result-prefixes="ss">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:strip-space elements="*"/>

<!-- All rows of the table, and the first row that has a cell reading "Ad Name" -->
<xsl:variable name="rows" select="/ss:Workbook/ss:Worksheet/ss:Table/ss:Row" />
<xsl:variable name="header-row" select="$rows[ss:Cell/ss:Data='Ad Name'][1]" />
<!-- index-of() compares atomized (string) values, so this assumes the header row's text is unique among the rows -->
<xsl:variable name="header-row-num" select="index-of($rows, $header-row)" />

<!-- Position of each wanted column within the header row; the literals must match the header text exactly -->
<xsl:variable name="header-row-cells" select="$header-row/ss:Cell" />
<xsl:variable name="name-col-num" select="index-of($header-row-cells, $header-row-cells[ss:Data='Ad Name'][1])" />
<xsl:variable name="number-col-num" select="index-of($header-row-cells, $header-row-cells[ss:Data='UID'][1])" />
<xsl:variable name="status-col-num" select="index-of($header-row-cells, $header-row-cells[ss:Data='Stattus'][1])" />

<xsl:template match="/ss:Workbook">
    <adverts>
        <xsl:apply-templates select="ss:Worksheet/ss:Table/ss:Row[position() gt $header-row-num]"/>
    </adverts>      
</xsl:template>

<xsl:template match="ss:Row">
    <advert>
        <advertName>
            <xsl:value-of select="ss:Cell[$name-col-num]/ss:Data"/>
        </advertName>
        <advertNumber>
            <xsl:value-of select="ss:Cell[$number-col-num]/ss:Data"/>
        </advertNumber>
        <advertStatus>
            <xsl:value-of select="ss:Cell[$status-col-num]/ss:Data"/>
        </advertStatus>
    </advert>
</xsl:template>

</xsl:stylesheet>

Applied to your XML input example, the result is:

<?xml version="1.0" encoding="UTF-8"?>
<adverts>
   <advert>
      <advertName>WWW</advertName>
      <advertNumber>0A1</advertNumber>
      <advertStatus>active</advertStatus>
   </advert>
   <advert>
      <advertName>XXX</advertName>
      <advertNumber>1B2</advertNumber>
      <advertStatus>active</advertStatus>
   </advert>
   <advert>
      <advertName>YYY</advertName>
      <advertNumber>2C3</advertNumber>
      <advertStatus>inactive</advertStatus>
   </advert>
</adverts>

Note:

  1. I have an excel as follows which has header data in 6th row.

    Contrary to what you say and show in your screenshot, the header row in your XML is actually the 5th Row element, not the 6th: the empty second row of the sheet is not written out, which is why the second Row element carries ss:Index="3". The stylesheet above identifies the header row by the presence of a cell containing "Ad Name". If you know the row number in advance, you can simplify the stylesheet by using that number directly.

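  2. In the XML you posted, the column name was "Stattus", not "Status". Accordingly, the stylesheet above looks for "Stattus" in order to return the expected result when processing that XML. The header lookup is an exact string match, so if the header cell in your file reads "Status", change the literal in the status-col-num variable to match; otherwise no header cell is found and the transformation will not return the expected result.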
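
The simplification mentioned in note 1 could look roughly like the sketch below. It assumes the header is always the 5th Row element of the first worksheet's table and that the third header reads "Status", as in the XML shown in the question. Both values are assumptions taken from that sample, and the stylesheet does not verify them. The column positions are still looked up by header text, so the columns may appear in any order.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet"
    exclude-result-prefixes="ss">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:strip-space elements="*"/>

<!-- Assumption: the header is always the 5th ss:Row element of the table -->
<xsl:variable name="header-row-num" select="5"/>

<!-- Cells of the header row; the integer predicate selects the row by position -->
<xsl:variable name="header-row-cells"
    select="/ss:Workbook/ss:Worksheet[1]/ss:Table/ss:Row[$header-row-num]/ss:Cell"/>

<!-- Column positions are found by exact header text, as in the stylesheet above -->
<xsl:variable name="name-col-num"
    select="index-of($header-row-cells, $header-row-cells[ss:Data='Ad Name'][1])"/>
<xsl:variable name="number-col-num"
    select="index-of($header-row-cells, $header-row-cells[ss:Data='UID'][1])"/>
<!-- Assumption: the header is spelled 'Status', as in the sample XML in the question -->
<xsl:variable name="status-col-num"
    select="index-of($header-row-cells, $header-row-cells[ss:Data='Status'][1])"/>

<xsl:template match="/ss:Workbook">
    <adverts>
        <!-- Every row after the header row is treated as a data row -->
        <xsl:apply-templates
            select="ss:Worksheet[1]/ss:Table/ss:Row[position() gt $header-row-num]"/>
    </adverts>
</xsl:template>

<xsl:template match="ss:Row">
    <advert>
        <advertName>
            <xsl:value-of select="ss:Cell[$name-col-num]/ss:Data"/>
        </advertName>
        <advertNumber>
            <xsl:value-of select="ss:Cell[$number-col-num]/ss:Data"/>
        </advertNumber>
        <advertStatus>
            <xsl:value-of select="ss:Cell[$status-col-num]/ss:Data"/>
        </advertStatus>
    </advert>
</xsl:template>

</xsl:stylesheet>
```

For the sample XML in the question, this should produce the same three advert elements as the result shown above. Keep in mind that the number 5 is the position of the Row element in the file, not the Excel row number, so this shortcut is only safe when the layout above the header never changes. Like the stylesheet above, it also relies on the data rows having their cells in the same positions as the header row.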


3 Comments

I had already tried similar code. It does not work if the Ad Name data appears in column B and UID in column A. I would like to verify the header name first.
@Joe You need to clarify what exactly is known and what can change.
Yes, I am working with that code only. It's awesome. Thanks.
