Find and replace strings in HTML

Question

From this HTML code:

<p class="description" dir="ltr">Name is a fine man. <br></p>

I'm trying to replace "Name" using the following code:

target = soup.find_all(text="Name")
for v in target:
    v.replace_with('Id')

The output I would like to have is:

<p class="description" dir="ltr">Id is a fine man. <br></p>

When I print the search result, I get an empty list:

print(target)
[]

Why doesn't it find the "Name"?

Thanks!

Try using str.replace(). Are you sure you have some text that matches it? (sinhayash, Jul 4, 2015 at 11:41)
You have no element with the text "Name". Show your HTML and what you intend to do. (Daniel, Jul 4, 2015 at 11:50)
It's not really clear what you're asking. Firstly, did your soup find any text='Name'? Also, is there any code in between replace_with and print target? (Paul Rooney, Jul 4, 2015 at 11:51)
@PaulRooney, apparently my soup did not find text='Name', and there is no code in between. (Diego, Jul 4, 2015 at 12:03)

har07 · Accepted Answer · 2015-07-04 12:02:50Z

8

The text node in your HTML contains other text besides "Name", so an exact-match search finds nothing. Relax the search criteria to a "contains" match instead, for example by using a regex. Then replace each matched text node with its original text in which the "Name" part has been swapped for "Id", using the plain str.replace() method. For example:

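from bs4 import BeautifulSoup
import re

html = """<p class="description" dir="ltr">Name is a fine man. <br></p>"""
# Name the parser explicitly so the result does not depend on what is installed.
soup = BeautifulSoup(html, "html.parser")
# A regex matches any text node that contains "Name", not only exact matches.
# (bs4 4.4.0 and later also accept this argument under the name string=.)
target = soup.find_all(text=re.compile(r'Name'))
for v in target:
    v.replace_with(v.replace('Name', 'Id'))
print(soup)

Output:

<p class="description" dir="ltr">Id is a fine man. <br/></p>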
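One caveat with a plain substring match: both the regex 'Name' and str.replace() also hit longer words such as "Username". The sketch below is one possible variation, not part of the original answer. It assumes bs4 4.4.0 or later (for the string= argument) and uses a made-up HTML snippet to show a word-boundary pattern doing both the search and the substitution.

```python
import re
from bs4 import BeautifulSoup

# Illustrative input (not from the question): "Username" must stay untouched.
html = "<p>Name asked Username for the Name list.</p>"
soup = BeautifulSoup(html, "html.parser")

# \b anchors the match to whole words only.
pattern = re.compile(r"\bName\b")

# find_all returns a list up front, so replacing nodes inside the loop is safe.
for node in soup.find_all(string=pattern):
    node.replace_with(pattern.sub("Id", node))

print(soup.p.get_text())  # Id asked Username for the Id list.
```

Reusing the same compiled pattern for finding and replacing keeps the two steps from drifting apart.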

devautor · Answer · 2015-07-04 12:38Z

1

It returns an empty list because a plain-string text search must match the entire text of a node, not just part of it. Use a regular expression instead.
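A minimal sketch of that behaviour on the HTML from the question, assuming bs4 4.4.0 or later (where the argument is spelled string=) and the built-in html.parser:

```python
from bs4 import BeautifulSoup

html = '<p class="description" dir="ltr">Name is a fine man. <br></p>'
soup = BeautifulSoup(html, "html.parser")

# A plain string is compared against the whole text node, so this finds nothing.
exact = soup.find_all(string="Name")

# The same kind of search succeeds once the string equals the entire node,
# trailing space included.
whole = soup.find_all(string="Name is a fine man. ")

print(len(exact))  # 0
print(len(whole))  # 1
```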

From the official docs: BeautifulSoup - Search text

text is an argument that lets you search for NavigableString objects instead of Tags. Its value can be a string, a regular expression, a list or dictionary, True or None, or a callable that takes a NavigableString object as its argument:

soup.findAll(text="one")
# [u'one']
soup.findAll(text=re.compile("paragraph"))
# [u'This is paragraph ', u'This is paragraph ']
soup.findAll(text=lambda x: len(x) < 12)
# [u'Page title', u'one', u'.', u'two', u'.']
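The callable form can also solve the original problem without a regex. The following is only a sketch: replace_in_text_nodes is a hypothetical helper name, and it assumes bs4 4.4.0 or later, where the argument is called string= rather than text=.

```python
from bs4 import BeautifulSoup


def replace_in_text_nodes(soup, old, new):
    """Replace `old` with `new` in every text node that contains it.

    Hypothetical helper for illustration; returns the number of nodes changed.
    """
    # The callable receives each text node and keeps those containing `old`.
    nodes = soup.find_all(string=lambda s: s is not None and old in s)
    for node in nodes:
        node.replace_with(node.replace(old, new))
    return len(nodes)


html = '<p class="description" dir="ltr">Name is a fine man. <br></p>'
soup = BeautifulSoup(html, "html.parser")

changed = replace_in_text_nodes(soup, "Name", "Id")
print(changed)            # 1
print(soup.p.get_text())  # Id is a fine man.
```

A substring test avoids having to escape regex metacharacters in the search text, at the cost of losing pattern features such as word boundaries.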

P.S.: This has already been answered here and here.

answered Jul 4, 2015 at 12:38 by devautor; edited May 23, 2017 at 12:14 by Community Bot
