0

In my xml example

<sample>
    <para>This text is sample paragraph with <url>https://www.google.com</url>. Thank you!</para>
</sample>

I would like to extract the following sentence from xml using python3.

This text is sample paragraph with https://www.google.com/. Thank you!

So I used Python 3 code as below.

# Sample.py
# root = xml root, xml = xml namespace
description = root.findall('./{0}sample/{0}para'.format(namespace))

for i in description:
    print(i.text)

But, Sample.py code above was very different from the output I wanted.

# Sample.py Output
This text is sample paragraph with

How can I print the text that includes the value in the url tag I want? (*It excludes simply using findall() func and attaching url(i.e., the location of url tags is unknown).)

2 Answers 2

2

Below (no need to use an external library)

import xml.etree.ElementTree as ET

xml1 = '''<sample>
    <para>This text is sample paragraph with <url>https://www.google.com</url>. Thank you!</para>
</sample>'''

root1 = ET.fromstring(xml1)
para = root1.find('.//para')
print('{} {} {}'.format(para.text, list(para)[0].text, list(para)[0].tail))

output

This text is sample paragraph with  https://www.google.com . Thank you!
Sign up to request clarification or add additional context in comments.

1 Comment

Thanks. It works well. However, this method requires an exception if there are more than one url tag in the middle or not using url tag. So I just decided to use bs4 like the above method.Thanks again for your time and consideration.
1

Using BeautifulSoup this works

soup.find("para").text
>>> 'This text is sample paragraph with https://www.google.com. Thank you!'

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.