
I have a problem generating XML with a Simple Transformation. Many of the tags in my XML are empty. I read that I can get rid of those tags with a regex, but it doesn't work perfectly. Let me show you how it looks.

Without Regex:

 <?xml version="1.0" encoding="utf-8" ?> 
<Invoice 
xmlns="urn:oasis:names:specification:ubl:schema:xsd:Invoice-2" 
xmlns:cac="urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2" 
xmlns:cbc="urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2" 
xmlns:xs="http://www.w3.org/2001/XMLSchema"
>
     <cbc:DueDate /> 
     <cbc:InvoiceTypeCode>380</cbc:InvoiceTypeCode> 
     <cbc:Note /> 
     <cbc:DocumentCurrencyCode>PLN</cbc:DocumentCurrencyCode> 
     <cbc:TaxCurrencyCode /> 
     <cbc:BuyerReference /> 
     <cac:InvoicePeriod>
      <cbc:StartDate /> 
      <cbc:EndDate /> 
      <cbc:DescriptionCode /> 
     </cac:InvoicePeriod>

Regex written in ABAP:

    REPLACE ALL OCCURRENCES OF REGEX
      '(<!\[CDATA\[([^]]|(\][^]])|(\]\][^>]))*\]\]>)|(<([^?][^><\s]*)(\s[^><]+)?/>)'
      IN exportxml
      WITH '$1'.

After using Regex:

      <cbc:InvoiceTypeCode>380</cbc:InvoiceTypeCode> 
      <cbc:DocumentCurrencyCode>PLN</cbc:DocumentCurrencyCode> 
      <cac:InvoicePeriod />

The Simple Transformation looks like this:

<?sap.transform simple?>
<tt:transform xmlns:tt="http://www.sap.com/transformation-templates" xmlns:ddic="http://www.sap.com/abapxml/types/dictionary" xmlns:def="http://www.sap.com/abapxml/types/defined">
  <tt:root name="ZXT_INVOICE" type="ddic:ZXT_INVOICE"/>
  <tt:template>
    <Invoice
xmlns:cbc="urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2"
xmlns:cac="urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2"
xmlns:ccts="urn:un:unece:uncefact:documentation:2" 
xmlns:qdt="urn:oasis:names:specification:ubl:schema:xsd:QualifiedDatatypes-2" xmlns:udt="urn:un:unece:uncefact:data:specification:UnqualifiedDataTypesSchemaModule:2" 
xmlns:xs="http://www.w3.org/2001/XMLSchema" 
xmlns="urn:oasis:names:specification:ubl:schema:xsd:Invoice-2"
>
      <cbc:DueDate tt:value-ref=".ZXT_INVOICE.DUEDATE"/>
      <cbc:InvoiceTypeCode tt:value-ref=".ZXT_INVOICE.INVOICETYPECODE"/>
      <cbc:Note tt:value-ref=".ZXT_INVOICE.NOTE"/>
      <cbc:DocumentCurrencyCode tt:value-ref=".ZXT_INVOICE.DOCUMENTCURRENCYCODE"/>
      <cbc:TaxCurrencyCode tt:value-ref=".ZXT_INVOICE.TAXCURRENCYCODE"/>
      <cbc:AccountingCost tt:value-ref=".ZXT_INVOICE.ACCOUNTINGCOST"/>
      <cbc:BuyerReference tt:value-ref=".ZXT_INVOICE.BUYERREFERENCE"/>
      <cac:InvoicePeriod>
        <cbc:StartDate tt:value-ref=".ZXT_INVOICE.INVOICE_PERIOD.STARTDATE"/>
        <cbc:EndDate tt:value-ref=".ZXT_INVOICE.INVOICE_PERIOD.ENDDATE"/>
        <cbc:DescriptionCode tt:value-ref=".ZXT_INVOICE.INVOICE_PERIOD.DESCRIPTIONCODE"/>
      </cac:InvoicePeriod>
    </Invoice>
  </tt:template>
</tt:transform>

The regex removes simple elements, but it has a problem with nested elements like <cac:InvoicePeriod>: the parent is left behind as an empty element once its children are removed. My program has many nested elements. Can you help me modify the regex or find another solution?

Thanks for any help.

  • It seems that running the same regex a second time would also remove the InvoicePeriod element? Commented Sep 2, 2020 at 11:12
  • Does this answer your question? RegEx match open tags except XHTML self-contained tags Commented Sep 2, 2020 at 13:58
  • No need for a regex; there's a built-in option when you call the Simple Transformation: CALL TRANSFORMATION ... OPTIONS initial_components = 'suppress'. Commented Sep 2, 2020 at 14:11
  • Regex is not a good idea for HTML or XML processing. Use @SandraRossi's hint or use an XSLT transformation to achieve what you want (Sandra's solution is much simpler though). Commented Sep 2, 2020 at 18:41
  • Hmm, you're right, it's not sufficient, because the option operates only on the intermediate transformation of the data object to the SAP format asXML (the XML input of the transformation), and your Simple Transformation always generates the mentioned elements. Please disregard this option. Another solution is to change the code of the transformation so that it does not generate an element when it is empty. Commented Sep 3, 2020 at 11:52

2 Answers


Your ABAP regex literal:

(<!\[CDATA\[([^]]|(\][^]])|(\]\][^>]))*\]\]>)|(<([^?][^><\s]*)(\s[^><]+)?/>)

could be corrected and simplified this way:

(<!\[CDATA\[(?:(?!\]\]>).)*\]\]>)|<[^?!](?:(?!>|\/>).)*\/>

NB: (?!xyz). is a negative lookahead followed by a dot. It matches any single character (.) unless that character is an x followed by yz.
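A single pass of either pattern still leaves nested containers behind, because a parent such as <cac:InvoicePeriod> only becomes empty after its children have been removed. One way around this, along the lines of the comment about running the regex twice, is to repeat the replacement until the string stops changing. The sketch below illustrates that loop in Python rather than ABAP, purely so it can be run anywhere. The pattern and the name strip_empty_elements are assumptions made for this example, not the regex from the question, and the pattern only handles elements without attributes.

```python
import re

# Assumed pattern for this sketch (not the regex from the question):
#   group 1 keeps CDATA sections untouched, as the original regex does;
#   the second alternative matches an attribute-free empty element,
#   written either as <tag/> or as <tag></tag> with optional whitespace.
EMPTY_ELEMENT = re.compile(
    r"(<!\[CDATA\[.*?\]\]>)"
    r"|<([^\s<>/?!]+)\s*(?:/>|>\s*</\2\s*>)",
    re.DOTALL,
)


def strip_empty_elements(xml: str) -> str:
    """Remove empty elements repeatedly until a pass changes nothing."""
    while True:
        # Put CDATA sections back unchanged; drop every empty element.
        stripped = EMPTY_ELEMENT.sub(lambda m: m.group(1) or "", xml)
        if stripped == xml:
            return stripped
        xml = stripped


sample = "<a><b/><c>380</c><d><e/><f></f></d></a>"
cleaned = strip_empty_elements(sample)
```

In ABAP the same idea would presumably wrap the REPLACE statement in a loop and compare exportxml before and after each pass. Comparing the strings looks safer than checking sy-subrc, since a pass that only re-inserts a CDATA section also counts as a replacement.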


Solution: remove empty XML elements recursively with XSLT:

<xsl:stylesheet version="1.0" 
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:strip-space elements="*"/>

<xsl:template match="*[descendant::text() or descendant-or-self::*/@*[string()]]">
    <xsl:copy>
        <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
</xsl:template>

<xsl:template match="@*[string()]">
    <xsl:copy/>
</xsl:template>

</xsl:stylesheet>


This works perfectly for me. Thanks for the help.
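For readers who want to see the recursion spelled out, here is a rough Python equivalent of what the stylesheet does, using the standard xml.etree.ElementTree module. It is only an illustration of the logic: the function name prune_empty is made up for this sketch, it assumes the document has no mixed content (text that follows a removed child is dropped along with it), and unlike the XSLT it leaves empty attributes on the elements it keeps.

```python
import xml.etree.ElementTree as ET


def prune_empty(elem: ET.Element) -> bool:
    """Remove empty descendants of elem in place.

    Returns True when elem itself ends up empty, meaning it has no
    non-whitespace text, no attribute with a value, and no children left.
    """
    # Iterate over a copy, because children are removed during the loop.
    for child in list(elem):
        if prune_empty(child):
            elem.remove(child)  # assumption: child.tail holds only whitespace

    has_text = bool(elem.text and elem.text.strip())
    has_attr = any(value for value in elem.attrib.values())
    return not (has_text or has_attr or len(elem))


root = ET.fromstring(
    '<Invoice xmlns:cbc="urn:example:cbc">'
    "<cbc:DueDate/>"
    "<cbc:InvoiceTypeCode>380</cbc:InvoiceTypeCode>"
    "<InvoicePeriod><cbc:StartDate/><cbc:EndDate/></InvoicePeriod>"
    "</Invoice>"
)
prune_empty(root)
result = ET.tostring(root, encoding="unicode")
```

Children are processed before their parent, so a container like InvoicePeriod is judged only after its own empty children are gone. That ordering is what the single-pass regex lacks.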
