
I am trying to parse XML text. It is stored in the table t_testxml, in the column xml_data, which is of type CLOB.

The xml looks like:

<?xml version="1.0" encoding="UTF-8"?>
<defaultmpftest:defaultmpftest xmlns:defaultmpftest="http://test.com"
 test_id = "1231"
 test_name = "name_test">
</mpftestdata:additionalLinkUrl xmlns:mpftestdata="http://test2.com"/>
</defaultmpftest:defaultmpftest>

How can I extract the values of test_id and test_name?

I tried:

Select extract(xmltype.createxml(t.xml_data),'//defaultmpftest:defaultmpftest/@test_id').getStringVal() from t_testxml t;

But it is not working. I get the following error:

ORA-31011: XML Parsing failed
LPX-00601: Invalid token in defaultmpftest:defaultmpftest/@test_id 

Can you please give me some advice on this matter?

Thank you!

1
  • What you've posted as your raw XML is invalid, as pointed out in an answer, but if run as shown you'd get a different error - LPX-00231, before it has a chance to hit LPX-00601. Removing the stray / at the start of the inner node name would make the XML valid and would then give the error you showed. It's helpful to show what you are actually using. See how to create an MCVE. Commented Apr 21, 2017 at 9:45

3 Answers

3

The XML shown in the question is not well-formed, and would cause an "LPX-00231: invalid character" error if you passed it to XMLType, so that isn't the string you're actually using. I'm assuming that is a typo made when posting the question, and that you are actually getting the "LPX-00601: Invalid token" error you reported. I'll base this answer on that assumption, and on a string without that typo.

extract is deprecated; but even so, to use it here (with corrected raw XML) you need to specify the namespace with the optional third argument:

select extract(xmltype.createxml(t.xml_data),
  '//defaultmpftest:defaultmpftest/@test_id',
  'xmlns:defaultmpftest="http://test.com"').getStringVal()
from t_testxml t;

EXTRACT(XMLTYPE.CREATEXML(T.XML_DATA),'//DEFAULTMPFTEST:DEFAULTMPFTEST/@TEST_ID','XMLNS:DEFAULTMPFTEST="HTTP://TEST.COM"').GETSTRINGVAL()
-----------------------------------------------------------------------------------------------------------------------------------------
1231

Rather than using the deprecated function, you could use XMLQuery:

select xmlquery(
  'declare namespace defaultmpftest="http://test.com"; (: :)
   //defaultmpftest:defaultmpftest/@test_id'
  passing xmltype.createxml(t.xml_data)
  returning content).getStringVal()
from t_testxml t;

XMLQUERY('DECLARENAMESPACEDEFAULTMPFTEST="HTTP://TEST.COM";(::)//DEFAULTMPFTEST:DEFAULTMPFTEST/@TEST_ID'PASSINGXMLTYPE.CREATEXML(T.XML_DATA)RETURNINGCONTENT).GETSTRINGVAL()
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
1231

You would need two XMLQuery clauses to get both values. I'd usually use XMLTable instead, shown here with the (fixed) XML-as-string provided via a CTE:

with t_testxml(xml_data) as (select '<?xml version="1.0" encoding="UTF-8"?>
<defaultmpftest:defaultmpftest xmlns:defaultmpftest="http://test.com"
 test_id="1231"
 test_name="name_test">
<mpftestdata:additionalLinkUrl xmlns:mpftestdata="http://test2.com"/>
</defaultmpftest:defaultmpftest>' from dual
)
select x.test_id, x.test_name
from t_testxml t
cross join xmltable(
  xmlnamespaces('http://test.com' as "defaultmpftest"),
  '//defaultmpftest:defaultmpftest'
  passing xmltype(t.xml_data)
  columns test_id number path '@test_id',
    test_name varchar2(30) path '@test_name'
) x;

   TEST_ID TEST_NAME                     
---------- ------------------------------
      1231 name_test                     
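
For comparison, the two-XMLQuery variant mentioned above might look roughly like the sketch below. It reuses the same corrected XML in a CTE and assumes XMLCast is available (Oracle 11.2 or later) to convert each result to a SQL type; the column aliases are just illustrative:

```sql
with t_testxml(xml_data) as (select '<?xml version="1.0" encoding="UTF-8"?>
<defaultmpftest:defaultmpftest xmlns:defaultmpftest="http://test.com"
 test_id="1231"
 test_name="name_test">
<mpftestdata:additionalLinkUrl xmlns:mpftestdata="http://test2.com"/>
</defaultmpftest:defaultmpftest>' from dual
)
select
  -- one XMLQuery call per attribute, each declaring the namespace itself
  xmlcast(xmlquery(
    'declare namespace defaultmpftest="http://test.com"; (: :)
     /defaultmpftest:defaultmpftest/@test_id'
    passing xmltype(t.xml_data)
    returning content) as number) as test_id,
  xmlcast(xmlquery(
    'declare namespace defaultmpftest="http://test.com"; (: :)
     /defaultmpftest:defaultmpftest/@test_name'
    passing xmltype(t.xml_data)
    returning content) as varchar2(30)) as test_name
from t_testxml t;
```

Each call parses the CLOB and repeats the namespace declaration, which is why XMLTable tends to be tidier once more than one value is needed.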

Read more about using these functions.


1 Comment

Thank you! Very useful! It helped.
1

The XML is not well-formed; try

<mpftestdata:additionalLinkUrl xmlns:mpftestdata="http://test2.com" />

or

<mpftestdata:additionalLinkUrl xmlns:mpftestdata="http://test2.com"></mpftestdata:additionalLinkUrl>

2 Comments

Thank you for the answer, but I can't change the XML. And I need information from the upper node, defaultmpftest:defaultmpftest.
The data provided is not XML, so you cannot use any XML function on it. You have to use REGEXP functions in this case.
1

As Wernfried Domscheit correctly stated, your XML is not well-formed. And for XML that is not well-formed there's no way to extract information from it in regular ways, simply because the regular ways are for XML, and your "XML" is not really XML.

Let's try non-regular ways then...

with t_testxml as (
    select q'{<?xml version="1.0" encoding="UTF-8"?>
<defaultmpftest:defaultmpftest xmlns:defaultmpftest="http://test.com"
 test_id = "1231"
 test_name = "name_test">
</mpftestdata:additionalLinkUrl xmlns:mpftestdata="http://test2.com"/>
</defaultmpftest:defaultmpftest>}' as xml_data
    from dual
)
-- cut out the root element's opening tag, self-close it, and parse only that fragment
select xmlcast(
         xmlparse(content
           regexp_substr(T.xml_data, '<defaultmpftest:defaultmpftest[^>]+') || ' />'
         ).extract('*/@test_id')
         as integer
       ) as test_id
from t_testxml T
;
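
If parsing even the repaired fragment is not wanted, a purely regex-based variant along the lines suggested in the comments could be sketched as follows. It assumes each attribute name appears only once in the text and that the values are double-quoted; it uses the six-argument form of regexp_substr (11g or later) to return just the captured group:

```sql
with t_testxml as (
    select q'{<?xml version="1.0" encoding="UTF-8"?>
<defaultmpftest:defaultmpftest xmlns:defaultmpftest="http://test.com"
 test_id = "1231"
 test_name = "name_test">
</mpftestdata:additionalLinkUrl xmlns:mpftestdata="http://test2.com"/>
</defaultmpftest:defaultmpftest>}' as xml_data
    from dual
)
select
  -- last argument 1 = return the first parenthesised subexpression only
  regexp_substr(T.xml_data, 'test_id\s*=\s*"([^"]*)"', 1, 1, null, 1) as test_id,
  regexp_substr(T.xml_data, 'test_name\s*=\s*"([^"]*)"', 1, 1, null, 1) as test_name
from t_testxml T
;
```

Both columns come back as strings; wrap test_id in to_number if a numeric value is needed. This does no XML processing at all, so escaped entities in attribute values would be returned as-is.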
