1

Here is a sample of some XML code I have:

<VAST version="2.0">
<Ad id="602678">
<InLine>
<AdSystem>Acudeo Compatible</AdSystem>
<AdTitle>NonLinear Test Campaign 1</AdTitle>
<Description>NonLinear Test Campaign 1</Description>
<Creatives>
<Creative AdID="602678-NonLinear">
</Creative>
</Creatives>
</InLine>
</Ad>
</VAST>

This XML is served from a URL, so I request that URL to get the data back. In some cases nothing is returned, so I am looking for a way to verify whether the "Creatives" node is present in whatever comes back at any given time. I have tried BeautifulSoup with no luck, though I think it is aimed more at HTML than XML. Any help is greatly appreciated, thanks.

2
  • What's wrong with plain-text checking? i.e. if "<Creatives>" in your_returned_payload: ... Commented Jun 7, 2017 at 2:18
  • I think the trouble I am having is with parsing the XML data itself after pulling it from the web URL I am hitting. I will update my post with my code. Commented Jun 7, 2017 at 2:23

2 Answers 2

1

Let's suppose you retrieve the XML from a URL like this:

import requests

r = requests.get(url)
if r.status_code == 200:
    xml_tag_exists(r)

Then you just have to build a simple function which will return True/False based on whether the required XML tag exists:

def xml_tag_exists(r):
    return '<Creatives>' in r.text

For example, let's take the following URL:

>>> import requests
>>> url = 'http://www.w3schools.com/xml/plant_catalog.xml'
>>> r = requests.get(url)
>>> if r.status_code == 200:
...     print(r.text)

The above prints XML of the following form:

<CATALOG>
  <PLANT>
    <COMMON>Bloodroot</COMMON>
    <BOTANICAL>Sanguinaria canadensis</BOTANICAL>
    <ZONE>4</ZONE>
    <LIGHT>Mostly Shady</LIGHT>
    <PRICE>$2.44</PRICE>
    <AVAILABILITY>031599</AVAILABILITY>
  </PLANT>
  <PLANT>
    <COMMON>Columbine</COMMON>
    <BOTANICAL>Aquilegia canadensis</BOTANICAL>
    <ZONE>3</ZONE>
    <LIGHT>Mostly Shady</LIGHT>
    <PRICE>$9.37</PRICE>
    <AVAILABILITY>030699</AVAILABILITY>
  </PLANT>
  ...
</CATALOG>

If we then check it for a tag:

>>> if '<CATALOG>' in r.text:
...     print(True)
...
True

So, if I were to do this, I'd write something like this:

import requests


def xml_tag_exists(r):
    return '<Creatives>' in r.text


def main():
    r = requests.get('your_url_goes_here')
    if r.status_code == 200:
        # Print the result so the check is not silently discarded
        print(xml_tag_exists(r))


if __name__ == '__main__':
    main()

2 Comments

That will work only if the returned content is only "<Creatives>" and nothing else.
@zwer it was just a typo. It was meant to be "<Creatives>" in r.text. Thanks for the heads-up. Fixed now.
0

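You can also use XPath:

from io import StringIO
from lxml import etree

# YOURS_XML is the XML text returned by the server
f = StringIO(YOURS_XML)
tree = etree.parse(f)

# '//Creatives' matches the node at any depth;
# '/Creatives' would only match a root element named Creatives
creatives_node = tree.xpath('//Creatives')
# xpath() returns a list, so an empty list means the node is absent
node_exists = bool(creatives_node)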
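
If lxml is not installed, the same idea can probably be done with the standard library. The following is only a minimal sketch: the helper name has_creatives and the shortened sample string are my own illustrations, not part of the question. It assumes that an empty or malformed response should simply count as "node not present", which matches the case in the question where nothing is returned.

```python
import xml.etree.ElementTree as ET


def has_creatives(payload):
    """Return True if payload parses as XML and contains a Creatives element.

    has_creatives is a hypothetical helper name used for illustration.
    """
    # An empty or whitespace-only response cannot contain the node.
    if not payload or not payload.strip():
        return False
    try:
        root = ET.fromstring(payload)
    except ET.ParseError:
        # Assumption: malformed XML is treated the same as a missing node.
        return False
    # iter() walks the root and all of its descendants, so this also
    # covers the case where Creatives is the root element itself.
    return next(root.iter("Creatives"), None) is not None


# Shortened version of the VAST document from the question.
sample = """<VAST version="2.0"><Ad id="602678"><InLine><Creatives>
<Creative AdID="602678-NonLinear"></Creative></Creatives></InLine></Ad></VAST>"""

print(has_creatives(sample))  # True
print(has_creatives(""))      # False
```

With requests you would pass r.text (or r.content) to the function. Unlike a plain substring test, this also recognises forms such as a self-closing <Creatives/> element.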
