PHP SimpleHTMLDomParser + Find lonely string

Question

I am using SimpleHTMLDomParser to go through HTML code and find various things. Everything works fine so far, but there is one problem:

How do I find a string that has no ID, no class and no unique parent element?

In my case, I started with extracting content from a div:

$descrs = $html->find('.show_synopsis');

foreach ($descrs as $descr) {
    echo $descr->innertext;
}

The output looks like this:

<div class="show_synopsis">

    Lorem ipsum dolor sit amet, consetetur sadipscing elitr. <b>Source:</b> LORES.

</div>

Now, is it possible to find and delete LORES from the above example?

Since LORES is variable and can change, I was wondering if it's possible to simply always find the word next to Source:?

I have tried a few different approaches, but none has worked so far. I also tried to adapt a solution from this post, but wasn't able to adjust it to my needs.

Juan · Accepted Answer · 2012-05-10 17:56:18Z

2

Try this:

echo preg_replace('/(.?)<b>.*Source:.*<\/b>.*\./', '$1', $descr->innertext);
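Note that the greedy `.*` in the pattern above can swallow everything up to the last period in the string, so text following the source name would be removed too. The following is a minimal sketch of a tighter variant, not the answerer's code. It assumes the credit always has the form `<b>Source:</b> NAME` with a one-word NAME; `removeSourceNote()` is a hypothetical helper name and is not part of SimpleHTMLDom.

```php
<?php
// Sketch only: removeSourceNote() is a hypothetical helper, not a SimpleHTMLDom function.
// Assumption: the credit always looks like "<b>Source:</b> NAME" with a one-word NAME.
function removeSourceNote(string $html): string
{
    // \s*          whitespace before the label
    // <b>...<\/b>  the bold "Source:" label, tolerating spaces inside the tag
    // [^\s<.]+     the single word after the label (the source name)
    // \.?          an optional period directly after the name
    $pattern = '/\s*<b>\s*Source:\s*<\/b>\s*[^\s<.]+\.?/';

    // preg_replace() returns null on a regex error; fall back to the input then.
    return preg_replace($pattern, '', $html) ?? $html;
}

echo removeSourceNote('Lorem ipsum elitr. <b>Source:</b> LORES.'), "\n";
echo removeSourceNote('Random text. <b>Source:</b> CBS Random text.'), "\n";
```

In the loop from the question this would be called as `echo removeSourceNote($descr->innertext);`. Because the name is matched as "one run of characters without whitespace, `<`, or a period", a multi-word source name would only lose its first word.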

edited May 10, 2012 at 17:56

answered May 10, 2012 at 17:41


4 Comments

r0skar Over a year ago

This doesn't seem to work. I get no error message, but "Source: LORES" is echoed as well and is still visible in the output!

Juan Over a year ago

You first said "Source:" and then " Source: " (with spaces before and after); which one do you need to replace?

r0skar Over a year ago

Sorry for the confusion. It looks exactly like this: Random text. Source: CBS Random text.

Juan Over a year ago

I just edited my answer to cover both cases, with and without spaces. Please try it again.

nmford · Accepted Answer · 2012-05-10 17:06:24Z

1

Can't you just replace LORES in the string you are echoing?

echo str_replace('LORES', '', $descr->innertext);

answered May 10, 2012 at 17:06


6 Comments

r0skar Over a year ago

Oh yeah :) That seems to work. One last thing: if I go this way and have several strings, would I simply write a foreach loop and check each string, or is there a simpler solution?

nmford Over a year ago

You can replace the 'LORES' needle in str_replace with an array of multiple needles.
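As a rough illustration of the array form: `str_replace()` accepts an array as its first argument and replaces every listed needle. The helper name `removeKnownSources()` and the list of source names below are made up for the example.

```php
<?php
// Sketch only: removeKnownSources() and the $sources list are hypothetical.
function removeKnownSources(string $text, array $sources): string
{
    // With an array of needles and a single replacement string,
    // str_replace() replaces each needle with that string.
    return str_replace($sources, '', $text);
}

$sources = ['LORES', 'CBS'];
echo removeKnownSources('Random text. <b>Source:</b> CBS', $sources), "\n";
```

This avoids the explicit loop, but every possible source name still has to be listed up front, and a listed name is removed wherever it occurs in the text, not only after the label.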

r0skar Over a year ago

Oh, and in case it matters: the HTML always looks like this: Source: LORES. So would it even be possible to always remove Source: and the word next to it (LORES in our example)?

r0skar Over a year ago

I have modified my question a bit. Overall your solution works fine, but it would be more convenient if I could find a way that doesn't require knowing all possible strings and putting them in an array.

nmford Over a year ago

In this case you could use preg_replace instead of str_replace.
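A minimal sketch of what that could look like when only the name should go and the label should stay, so no list of known names is needed. It assumes the markup is always `<b>Source:</b>` followed by a one-word name; `removeSourceName()` is a hypothetical helper name.

```php
<?php
// Sketch only: removeSourceName() is a hypothetical helper.
// Assumption: the name is the single word that follows "<b>Source:</b>".
function removeSourceName(string $html): string
{
    // Group 1 captures the label plus the whitespace after it;
    // [^\s<.]+ matches the word to delete. '$1' puts the label back.
    $pattern = '/(<b>\s*Source:\s*<\/b>\s*)[^\s<.]+/';

    // preg_replace() returns null on a regex error; fall back to the input then.
    return preg_replace($pattern, '$1', $html) ?? $html;
}

echo removeSourceName('Lorem ipsum. <b>Source:</b> LORES.'), "\n";
```

Replacing `'$1'` with `''` would drop the label as well as the name.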
