My source xml is:

<?xml version="1.0" encoding="UTF-8"?>
<PMML version="4.1" xmlns="http://www.dmg.org/PMML-4_1" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.dmg.org/PMML-4_1 pmml-4-1.xsd">
<Header copyright="(C) Copyright IBM Corp. 1989, 2014.">
    <Application name="IBM SPSS Statistics 23.0" version="23.0.0.0"/>
</Header>
<GeneralRegressionModel algorithmName="multinomialLogistic" functionName="classification" modelType="multinomialLogistic" targetVariableName="CLASS">
    <MiningSchema>
        <MiningField missingValueTreatment="asIs" name="CLASS" usageType="predicted"/>
        <MiningField missingValueTreatment="asIs" name="ACTIVE_CUSTOMER" usageType="active"/>
        <MiningField missingValueTreatment="asIs" name="SEGMENT" usageType="active"/>
    </MiningSchema>
    <ParameterList>
        <Parameter label="Konstanter Term" name="P0000001"/>
        <Parameter label="[ACTIVE_CUSTOMER=0]" name="P0000002"/>
        <Parameter label="[ACTIVE_CUSTOMER=1]" name="P0000003"/>
        <Parameter label="[SEGMENT=0]" name="P00000004"/>
        <Parameter label="[SEGMENT=1]" name="P00000005"/>
    </ParameterList>
    <ParamMatrix>
        <PCell beta="-167.307903919999" df="1" parameterName="P0000001" targetCategory="1"/>
        <PCell beta="-0.0747629275586869" df="1" parameterName="P0000002" targetCategory="1"/>
        <PCell beta="0.409965797830495" df="1" parameterName="P0000003" targetCategory="1"/>
        <PCell beta="-1.03190717557433" df="1" parameterName="P0000004" targetCategory="1"/>
        <PCell beta="0.904157514089376" df="1" parameterName="P0000005" targetCategory="1"/>
    </ParamMatrix>
</GeneralRegressionModel>
</PMML>

My output xml is:

<?xml version="1.0" encoding="utf-8"?>
<Predictors xmlns:ns="some:ns" xmlns:rs="http://www.dmg.org/PMML-4_1" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <Predictor coefficient="-167.307903919999" name="__INTERCEPT__" value=""/>
  <Predictor coefficient="-0.0747629275586869" name="ACTIVE_CUSTOMER" value="0"/>
  <Predictor coefficient="0.409965797830495" name="ACTIVE_CUSTOMER" value="1"/>
  <Predictor coefficient="" name="SEGMENT" value="0"/>
  <Predictor coefficient="" name="SEGMENT" value="1"/>
</Predictors>

I could achieve this with the following xslt:

<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:sap="http://www.sap.com/sapxsl" xmlns:ns="some:ns" xmlns:rs="http://www.dmg.org/PMML-4_1" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" version="1.0">
  <xsl:output encoding="utf-8" indent="yes" method="xml"/>
  <xsl:strip-space elements="*"/>

  <xsl:key match="rs:ParamMatrix/rs:PCell" name="cell" use="@parameterName"/>
  <xsl:key match="rs:DataDictionary/rs:DataField" name="dataField" use="@name"/>

  <!-- identity transform -->
  <xsl:template match="node()|@*">
     <xsl:apply-templates select="node()|@*"/>
  </xsl:template>

  <xsl:template match="rs:GeneralRegressionModel">
    <!--MiningSchema-->
    <xsl:apply-templates select="rs:MiningSchema"/>

    <!--RegressionTable for predicted targetVariable targetCategory-->
    <Predictors>
      <xsl:apply-templates select="rs:ParameterList/rs:Parameter"/>
    </Predictors>

  </xsl:template>

    <xsl:template match="rs:Parameter[not(contains(@label, '='))][@name='P0000001']">
    <Predictor coefficient="{key('cell', @name)/@beta}" name="__INTERCEPT__" value=""/>
  </xsl:template>

  <xsl:template match="rs:Parameter[not(contains(@label, '='))][@name!='P0000001']">
    <Predictor coefficient="{key('cell', @name)/@beta}" name="{@label}" value=""/>
  </xsl:template>

  <xsl:template match="rs:Parameter[contains(@label, '=')]" name="split">
    <Predictor coefficient="{key('cell', @name)/@beta}" name="{substring-after(substring-before(@label,'='),'[')}" value="{substring-before(substring-after(@label,'='),']')}"/>
  </xsl:template>

</xsl:transform>

This XSLT works. However, I have two issues:

  1. The source XML declares a default namespace at the top, such as 'xmlns="http://www.dmg.org/PMML-4_1"', but the value could be different. The whole document uses only this one namespace. Currently my XSLT hard-codes the namespace as 'xmlns:rs="http://www.dmg.org/PMML-4_1"', which is not correct. How do I set the namespace dynamically in XSLT?

  2. Once I declare the namespace in the XSLT, it shows up in the output XML as well. How do I remove this namespace from the output XML?

If it's okay, could you please modify my XSLT directly to show me the usage?

Many thanks!!!

  • "at the beginning of the source xml, there is namespace, such as 'xmlns="dmg.org/PMML-4_1"', could be other value." Can you explain how exactly that works? A namespace is part of the XML schema - it is not supposed to change arbitrarily. Do you at least have a "bank" of possible namespaces? -- Re your 2nd question: use exclude-result-prefixes="rs". And remove the identity transform template: you're not copying anything from the source XML, and you don't want to copy anything from the source XML - otherwise you'll be copying its namespace too. Commented Jun 26, 2016 at 7:49
  • Note that the template your comment identifies as identity transform is not the identity transform as it does not use xsl:copy but rather only <xsl:apply-templates select="node()|@*"/>. As for the problem about the namespace being dynamic, with an XSLT 2.0 processor you could use *:foo, e.g. <xsl:template match="*:GeneralRegressionModel"><xsl:apply-templates select="*:MiningSchema"/>..., but I agree that this requirement of an arbitrary namespace sounds odd. Commented Jun 26, 2016 at 9:03
  • @michael.hor257k I am processing XML files generated by statistics software. If the XML is sent to me by someone using an older version of the software, the namespace could be 'xmlns="dmg.org/PMML-4_0"'; with a newer version of the software, it could be 'xmlns="dmg.org/PMML-4_1"'. My requirement is that the XSLT should work no matter which version of the software produced the file. Commented Jun 26, 2016 at 13:47
  • @michael.hor257k I simplified both the source XML and the target XML to make the question easier to post here. I do need the identity transform to copy some parts of the source XML. Commented Jun 26, 2016 at 14:14
  • @michael.hor257k @Martin Honnen I don't want to ignore the namespace. I am still working with XSLT 1.0, and I use this XSLT inside my program. If I parse the source XML file to get the namespace, can I pass that namespace to the XSLT as an input parameter? If so, do you think this is a feasible approach, and would you mind showing me how? ---- Many thanks. Commented Jun 26, 2016 at 14:20

4 Answers

To create an element in a namespace that is not known until run-time:

(a) change any literal result element such as <Predictor/> to <xsl:element name="Predictor" namespace="{$ns}"/>

(b) change any use of <xsl:copy/> to <xsl:element name="{local-name()}" namespace="{$ns}"/>

(c) change any use of <xsl:copy-of/> to a recursive copy using a modified identity template using <xsl:element/> as above.
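
Steps (b) and (c) can be sketched as a modified identity transform. This is only an illustration: the parameter name `ns` and its default value `urn:example:target` are assumptions, and the calling program is expected to supply the real URI through whatever parameter mechanism its XSLT processor offers.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- hypothetical parameter: the target namespace URI, supplied by the caller -->
  <xsl:param name="ns" select="'urn:example:target'"/>

  <!-- recreate every element under its local name, in the namespace held by $ns -->
  <xsl:template match="*">
    <xsl:element name="{local-name()}" namespace="{$ns}">
      <xsl:apply-templates select="@*|node()"/>
    </xsl:element>
  </xsl:template>

  <!-- attributes, text, comments and processing instructions are copied unchanged -->
  <xsl:template match="@*|text()|comment()|processing-instruction()">
    <xsl:copy/>
  </xsl:template>

</xsl:stylesheet>
```

Because every element is rebuilt with <xsl:element>, the namespace declarations of the source elements are not carried over, which is the reason for avoiding <xsl:copy> and <xsl:copy-of> on elements here.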

Alternatively, talk to the people who control this XML vocabulary and ask them why they are misusing namespaces in this way.

1 Comment

I believe they are describing a situation where the source namespace is not known in advance.

If you apply the following stylesheet to your input XML:

XSLT 1.0 (a)

<xsl:stylesheet version="1.0" 
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:strip-space elements="*"/>

<!-- identity transform -->
<xsl:template match="@*|node()">
    <xsl:copy>
        <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
</xsl:template>

<xsl:template match="*[namespace-uri()=/*/namespace::*[not(name())]]">
    <xsl:element name="{local-name()}" namespace="urn:x-my:constant-namespace">
        <xsl:apply-templates select="@*|node()"/>
    </xsl:element>
</xsl:template>

</xsl:stylesheet>

it will move all the elements in the (unknown) default namespace to a known and constant namespace urn:x-my:constant-namespace:

<?xml version="1.0" encoding="UTF-8"?>
<PMML xmlns="urn:x-my:constant-namespace" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" version="4.1" xsi:schemaLocation="http://www.dmg.org/PMML-4_1 pmml-4-1.xsd">
  <Header copyright="(C) Copyright IBM Corp. 1989, 2014.">
    <Application name="IBM SPSS Statistics 23.0" version="23.0.0.0"/>
  </Header>
  <GeneralRegressionModel algorithmName="multinomialLogistic" functionName="classification" modelType="multinomialLogistic" targetVariableName="CLASS">
    <MiningSchema>
      <MiningField missingValueTreatment="asIs" name="CLASS" usageType="predicted"/>
      <MiningField missingValueTreatment="asIs" name="ACTIVE_CUSTOMER" usageType="active"/>
      <MiningField missingValueTreatment="asIs" name="SEGMENT" usageType="active"/>
    </MiningSchema>
    <ParameterList>
      <Parameter label="Konstanter Term" name="P0000001"/>
      <Parameter label="[ACTIVE_CUSTOMER=0]" name="P0000002"/>
      <Parameter label="[ACTIVE_CUSTOMER=1]" name="P0000003"/>
      <Parameter label="[SEGMENT=0]" name="P00000004"/>
      <Parameter label="[SEGMENT=1]" name="P00000005"/>
    </ParameterList>
    <ParamMatrix>
      <PCell beta="-167.307903919999" df="1" parameterName="P0000001" targetCategory="1"/>
      <PCell beta="-0.0747629275586869" df="1" parameterName="P0000002" targetCategory="1"/>
      <PCell beta="0.409965797830495" df="1" parameterName="P0000003" targetCategory="1"/>
      <PCell beta="-1.03190717557433" df="1" parameterName="P0000004" targetCategory="1"/>
      <PCell beta="0.904157514089376" df="1" parameterName="P0000005" targetCategory="1"/>
    </ParamMatrix>
  </GeneralRegressionModel>
</PMML>

You can then apply a second stylesheet to the result, e.g.:

XSLT 1.0 (b)

<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:rs="urn:x-my:constant-namespace" 
exclude-result-prefixes="rs">
<xsl:output encoding="utf-8" indent="yes" method="xml"/>
<xsl:strip-space elements="*"/>

<xsl:key match="rs:PCell" name="cell" use="@parameterName"/>

<xsl:template match="/">
    <Predictors>
          <xsl:apply-templates/>
    </Predictors>
</xsl:template>

<xsl:template match="rs:Parameter[not(contains(@label, '='))][@name='P0000001']">
    <Predictor coefficient="{key('cell', @name)/@beta}" name="__INTERCEPT__" value=""/>
</xsl:template>

<xsl:template match="rs:Parameter[not(contains(@label, '='))][@name!='P0000001']">
    <Predictor coefficient="{key('cell', @name)/@beta}" name="{@label}" value=""/>
</xsl:template>

<xsl:template match="rs:Parameter[contains(@label, '=')]">
    <Predictor coefficient="{key('cell', @name)/@beta}" name="{substring-after(substring-before(@label,'='),'[')}" value="{substring-before(substring-after(@label,'='),']')}"/>
</xsl:template>

</xsl:stylesheet>

and receive:

<?xml version="1.0" encoding="utf-8"?>
<Predictors>
  <Predictor coefficient="-167.307903919999" name="__INTERCEPT__" value=""/>
  <Predictor coefficient="-0.0747629275586869" name="ACTIVE_CUSTOMER" value="0"/>
  <Predictor coefficient="0.409965797830495" name="ACTIVE_CUSTOMER" value="1"/>
  <Predictor coefficient="" name="SEGMENT" value="0"/>
  <Predictor coefficient="" name="SEGMENT" value="1"/>
</Predictors>


If you're looking to make the xsl namespace agnostic so that you can arbitrarily alter the namespace of input xml, then you'll need to run the transform in two stages.

If you include the following in your XSL and remove all the namespace references (URIs and prefixes) from it, apart from the ones you wish to see in the output:

<xsl:template match="@*" mode="stripNS">
    <xsl:attribute name="{local-name(.)}"><xsl:value-of select="."/></xsl:attribute>
</xsl:template>
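<!-- match elements only: local-name() is empty for text nodes, which the built-in template copies instead -->
<xsl:template match="*" mode="stripNS">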
    <xsl:element name="{local-name()}">
        <xsl:apply-templates select="node()|@*" mode="stripNS"/>
    </xsl:element>
</xsl:template>
<xsl:template match="/">
    <xsl:variable name="nakedXML">
        <xsl:apply-templates mode="stripNS"/>
    </xsl:variable>
    <xsl:apply-templates select="$nakedXML/*" />
</xsl:template>

The match on the root will always be the initial template to execute, regardless of the input namespace. The stylesheet then uses <xsl:element> and <xsl:attribute> to build a copy of your input XML in the variable $nakedXML, with all the namespaces stripped away.

From this point you can run <xsl:apply-templates> against $nakedXML. Note that in XSLT 1.0 the variable holds a result tree fragment, so processors generally require you to wrap $nakedXML in a suitable node-set() extension function. Each processor handles this differently, so check your documentation.

I should add that I don't fully endorse this technique. It has a significant impact on performance, and stripping out namespaces has the potential to create confusion later on. IMO, when content is written with namespaces it should always be referred to with that namespace.
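
As a rough sketch of how the two stages fit together in a single stylesheet, the following assumes a processor that supports the EXSLT `exsl:node-set()` function (other processors offer an equivalent under a different namespace). The `Predictor` output with only a `label` attribute is a simplified stand-in for the real second-stage templates, not the asker's full transformation.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:exsl="http://exslt.org/common"
    exclude-result-prefixes="exsl">
  <xsl:output method="xml" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- stage 1: rebuild elements and attributes under their local names, in no namespace -->
  <xsl:template match="*" mode="stripNS">
    <xsl:element name="{local-name()}">
      <xsl:apply-templates select="@*|node()" mode="stripNS"/>
    </xsl:element>
  </xsl:template>
  <xsl:template match="@*" mode="stripNS">
    <xsl:attribute name="{local-name()}"><xsl:value-of select="."/></xsl:attribute>
  </xsl:template>

  <xsl:template match="/">
    <xsl:variable name="nakedXML">
      <xsl:apply-templates mode="stripNS"/>
    </xsl:variable>
    <!-- stage 2: convert the result tree fragment to a node-set (assumes EXSLT support),
         then apply ordinary templates that use unprefixed names -->
    <Predictors>
      <xsl:apply-templates select="exsl:node-set($nakedXML)//Parameter"/>
    </Predictors>
  </xsl:template>

  <!-- simplified stand-in for the real second-stage templates -->
  <xsl:template match="Parameter">
    <Predictor label="{@label}"/>
  </xsl:template>

</xsl:stylesheet>
```

The unprefixed `Parameter` pattern matches here only because the second stage runs over the namespace-free copy, not over the original input.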

5 Comments

"If you're looking to make the xsl namespace agnostic so that you can arbitrarily alter the namespace of input xml, then you'll need to run the transform in two stages." No, that's not necessary. -- I do however agree that ignoring the namespace is a hack and should be avoided if possible.
@michael.hor257k - hmm, now you've got me thinking. Surely the NS-free match expression will not trigger while the input is bound to a namespace? I can't (yet) see how it's possible in one pass.
Read the comment by Martin Honnen above. There is an XSLT 1.0 equivalent, too.
So far as I can see, the XSLT 1.0 equivalent is to use match expressions such as *[name()='GeneralRegressionModel']. I'm not sure that I prefer this, as those expressions could get ugly, but it works in a single pass. Thanks @michael.hor257k for mentioning it; I wouldn't have even tried this technique otherwise.
@Phil B Thanks a lot for your reply. I don't really want to ignore the namespace. If I first parse the XML to find out the namespace for this particular XML file, can I use this namespace as an input parameter? How?

Following your comment on my previous answer:

Whenever you need to obtain the namespaces in scope for the current node, you should traverse the namespace axis. If we assume that all your document's namespaces are declared on the root element, you can use the XPath "/*/namespace::*" to obtain a node-set of all the namespaces.

So, for your example input, something like this...

  <xsl:for-each select="/*/namespace::*">
       <namespace prefix="{name()}" uri="{.}"/>
  </xsl:for-each>

would give you

<namespace prefix="" uri="http://www.dmg.org/PMML-4_1"/>
<namespace prefix="xsi" uri="http://www.w3.org/2001/XMLSchema-instance"/>
<namespace prefix="xml" uri="http://www.w3.org/XML/1998/namespace"/>

And if you just want the default namespace URI for the root node:

<xsl:value-of select="/*/namespace::*[not(name())]"/>
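
Building on that expression, a single-pass stylesheet could read the default namespace once into a variable and then select elements by local name and namespace URI. This is a minimal sketch under the same assumption that the namespace is declared on the root element; the variable name `ns` is made up for illustration. Note that XSLT 1.0 does not allow variable references in match patterns, so the variable is used in select expressions only.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <!-- URI of the default namespace declared on the root element (illustrative name) -->
  <xsl:variable name="ns" select="string(/*/namespace::*[not(name())])"/>

  <xsl:template match="/">
    <Predictors>
      <!-- select Parameter elements by local name, restricted to the detected namespace -->
      <xsl:for-each select="//*[local-name()='Parameter'][namespace-uri()=$ns]">
        <!-- look up the PCell whose parameterName equals this Parameter's name -->
        <Predictor label="{@label}"
                   coefficient="{//*[local-name()='PCell'][namespace-uri()=$ns][@parameterName=current()/@name]/@beta}"/>
      </xsl:for-each>
    </Predictors>
  </xsl:template>

</xsl:stylesheet>
```

Since the stylesheet declares no prefix for the source namespace, nothing needs to be excluded from the output.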

1 Comment

Thanks for the help. I was really struggling with the syntax, though, until Michael posted the script above. Is it just me who finds 'match="*[namespace-uri()=/*/namespace::*[not(name())]]"' difficult to understand? I will keep trying!
