I have the following function:

function get_string_between($string, $start, $end){
    $string = " ".$string;
    $ini = strpos($string,$start);
    if ($ini == 0) return "";
    $ini += strlen($start);
    $len = strpos($string,$end,$ini) - $ini;
    return substr($string,$ini,$len);
}  

I am passing the following information to this function:

$result = scraped HTML page;
$name = get_string_between($result, '<div class="model ww"> ',' </div>');
$name= strtok($name, "\n");

I expect the following results:

$name = 'XM1014 | Bone Machine (Well-Worn)';

The whole section is as follows:

<div class="modal ww"> XM1014 | Bone Machine (Well-Worn) </div>
  • You really shouldn't parse HTML like this. Use a PHP DOM parser instead. Commented Aug 13, 2015 at 19:12
  • I'm scraping a page protected by Cloudflare. I use a Python script to bypass this, which returns the page content to PHP. Correct me if I'm wrong, but I don't believe I can use a PHP DOM parser in this case. Commented Aug 13, 2015 at 19:16
  • You're scraping a page and getting the DOM, so you should use a DOM parser. Commented Aug 13, 2015 at 19:17

2 Answers

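Try the following code:

    function get_string_between($string, $start, $end){
        // Split on the start marker; the wanted text begins in the second piece.
        $ar = explode($start, $string, 2);
        if (count($ar) < 2) return "";   // start marker not found
        // Keep everything before the first occurrence of the end marker.
        $ar1 = explode($end, $ar[1], 2);
        return $ar1[0];
    }
    // The markers must match the page exactly: "modal", not "model", and the same quote style.
    $result = '<div class="modal ww"> XM1014 | Bone Machine (Well-Worn) </div>';
    $text_result = get_string_between($result, '<div class="modal ww"> ', ' </div>');
    print_r($text_result);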
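
For comparison, here is a minimal sketch of the same idea built on strpos with strict === false checks, run against a multi-line page rather than a single div. The function name string_between_strict and the sample page are illustrative assumptions, not part of the original post. The second call shows that a marker spelled "model ww" finds nothing, because the page uses "modal ww".

```php
<?php
// Hypothetical helper; named differently to avoid clashing with get_string_between().
function string_between_strict(string $haystack, string $start, string $end): string
{
    $from = strpos($haystack, $start);
    if ($from === false) {
        return ''; // start marker not found
    }
    $from += strlen($start); // move past the start marker
    $to = strpos($haystack, $end, $from);
    if ($to === false) {
        return ''; // end marker not found
    }
    return substr($haystack, $from, $to - $from);
}

// Sample page (an assumption standing in for the scraped HTML).
$page = "<html><body>\n<div class=\"header\">Shop</div>\n"
      . "<div class=\"modal ww\"> XM1014 | Bone Machine (Well-Worn) </div>\n</body></html>";

$found  = string_between_strict($page, '<div class="modal ww">', '</div>');
$missed = string_between_strict($page, '<div class="model ww">', '</div>'); // misspelled marker

echo trim($found), PHP_EOL;
var_dump($missed);
```

Trimming the result afterwards avoids having to guess the exact whitespace around the text inside the div.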

Alternatively, parse the HTML with Simple HTML DOM. Here is a short example:

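// Load the Simple HTML DOM library (adjust the path to wherever you saved it)
require_once 'simple_html_dom.php';

$html = new simple_html_dom();

// Load HTML from a string
$html->load('<html><body>
<div>test1</div>
<div>test2</div>
</body></html>');

// Print the text content of every div
foreach ($html->find('div') as $element) {
    print_r($element->plaintext);
}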

For the code above to work, you need to include the Simple HTML DOM library file. This way you can get the content of every div. You can read more in the library's documentation.

Finally, your function will be:

function get_string_between($string){
    $result = array();
    $html = new simple_html_dom();
    $html->load($string);

    // Collect the text content of every div on the page
    foreach ($html->find('div') as $element) {
        array_push($result, $element->plaintext);
    }

    return $result;
}

$result = "<div class='modal ww'> XM1014 | Bone Machine (Well-Worn) </div>";
$text_result = get_string_between($result);
print_r($text_result);

Hope it helps :)

7 Comments

This works when: $result = "<div class='modal ww'> XM1014 | Bone Machine (Well-Worn) </div>"; However, $result contains a whole HTML page, and that doesn't seem to work.
What other requirements are there? Please tell me so that I can update the code.
$name returns the whole page when passed through your function. There are no other requirements; the string I'm looking for will always lie between <div class="modal ww"> and </div> on the page.
Thanks, I have checked that and updated the answer. Now you can get the content between <div> tags.
Is it possible to amend the first section so that it functions correctly?

You can of course use a regexp, but there are libraries that will do the job more quickly and easily. Try using an HTML DOM parser, or phpQuery, which is a PHP implementation of the jQuery library. It's also available through Composer, so if you're familiar with jQuery syntax, you may find it very neat.

Get the HTML you need using this: $html = pq('.modal.ww')->html();
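
If you would rather not add a third-party library, the same lookup can be sketched with PHP's bundled DOM extension (DOMDocument and DOMXPath). This is only a sketch under assumptions: the helper name get_modal_ww_text and the sample page are made up for illustration, and the scraped page is assumed to arrive as a plain string, as it would from the Python script.

```php
<?php
// Hypothetical helper name; not from the original post.
function get_modal_ww_text(string $html): ?string
{
    $doc = new DOMDocument();
    // Scraped pages are rarely valid HTML, so collect parser warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // Match a div whose class attribute contains both "modal" and "ww" as whole tokens.
    $query = '//div[contains(concat(" ", normalize-space(@class), " "), " modal ")'
           . ' and contains(concat(" ", normalize-space(@class), " "), " ww ")]';
    $nodes = $xpath->query($query);

    if ($nodes === false || $nodes->length === 0) {
        return null; // no matching element
    }
    // textContent includes the surrounding spaces, so trim them.
    return trim($nodes->item(0)->textContent);
}

// Sample page (an assumption standing in for the scraped HTML).
$page = '<html><body><div class="header">Shop</div>'
      . '<div class="modal ww"> XM1014 | Bone Machine (Well-Worn) </div></body></html>';

$name = get_modal_ww_text($page);
echo $name, PHP_EOL;
```

Because the parser works on any HTML string, it does not matter that the page was fetched by a separate script before being handed to PHP.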
