1

I want to find the elements with class ft00 that sit between "Work Experience" and "EDUCATION AND TRAINING", and extract the text of those elements (the ones containing dates) from the given HTML:

<p class = "ft00">Introduction</p>
<p class = "ft00">John Smith</p>
<p class = "ft02">Email:</p>
<p class = "ft00">[email protected]</p>
<p class = "ft00">Work Experience</p>
<p class = "ft00">27 July 2017</p>
<p class = "ft02">ABC Company</p>
<p class = "ft00">19 May 2018</p>
<p class ="ft02">XYZ Company</p>
<p class = "ft00">EDUCATION AND TRAINING</p>

So far I have managed to extract all of the data between "Work Experience" and "EDUCATION AND TRAINING", regardless of class. That part works properly; the code is given below:

$fexp = $html->find('p[plaintext^=Work Experience]');
$items = array();
foreach ($fexp as $keye) {
    // Walk the following siblings until the next section heading.
    while ($keye->nextSibling()) {
        $keye = $keye->nextSibling();
        $varce = $keye->plaintext;
        if (trim($varce) == "EDUCATION AND TRAINING") {
            break;
        }
        $items[] = $varce;
    }
}
var_dump($items);

I am close, but I can't work out how to keep only the ft00 elements. Any help would be appreciated, thanks :-)

2
  • Be clear with your question Commented May 5, 2018 at 15:27
  • @PreciousTom which part you didn't get? Commented May 5, 2018 at 16:27

2 Answers

3

With DOMDocument and DOMXPath you could do it like the following. I've never used Simple HTML DOM Parser, but I'm presuming it has XPath or something equivalent.

<?php
$dom = new DOMDocument();

$dom->loadHtml('
<p class = "ft00">Introduction</p>
<p class = "ft00">John Smith</p>
<p class = "ft02">Email:</p>
<p class = "ft00">[email protected]</p>
<p class = "ft00">Work Experience</p>
<p class = "ft00">27 July 2017</p>
<p class = "ft02">ABC Company</p>
<p class = "ft00">19 May 2018</p>
<p class ="ft02">XYZ Company</p>
<p class = "ft00">EDUCATION AND TRAINING</p>
', LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);

$xpath = new DOMXPath($dom);

$result = [];
$matching  = false;
foreach ($xpath->query("//p[contains(@class, 'ft00') or contains(@class, 'ft02')]/text()") as $p) {
    if ($p->nodeValue === 'Work Experience' || $matching) {
        $result[] = $p->nodeValue;
        $matching = true;
    }
    if ($p->nodeValue === 'EDUCATION AND TRAINING') {
        break;
    }
}

print_r($result);

Result:

Array
(
    [0] => Work Experience
    [1] => 27 July 2017
    [2] => ABC Company
    [3] => 19 May 2018
    [4] => XYZ Company
    [5] => EDUCATION AND TRAINING
)

https://3v4l.org/0nvr4
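
The loop above also collects the ft02 lines and both headings. If only the ft00 text strictly between the two headings is wanted, a minimal sketch along the same lines might look like this. It assumes the headings match exactly after trimming, and the variable names ($dates, $inside) are only illustrative.

```php
<?php
// Sample markup copied from the question.
$html = <<<'HTML'
<p class = "ft00">Introduction</p>
<p class = "ft00">John Smith</p>
<p class = "ft02">Email:</p>
<p class = "ft00">[email protected]</p>
<p class = "ft00">Work Experience</p>
<p class = "ft00">27 July 2017</p>
<p class = "ft02">ABC Company</p>
<p class = "ft00">19 May 2018</p>
<p class ="ft02">XYZ Company</p>
<p class = "ft00">EDUCATION AND TRAINING</p>
HTML;

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

$dates  = [];
$inside = false; // true once the "Work Experience" heading has been seen

// Only visit <p class="ft00"> elements, in document order.
foreach ($xpath->query("//p[@class='ft00']") as $p) {
    $text = trim($p->textContent);
    if ($text === 'Work Experience') {
        $inside = true;   // start collecting after this heading
        continue;         // but do not store the heading itself
    }
    if ($text === 'EDUCATION AND TRAINING') {
        break;            // stop before storing the closing heading
    }
    if ($inside) {
        $dates[] = $text;
    }
}

print_r($dates);
```

With the sample markup this leaves only the two date strings in $dates.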


1 Comment

Thanks, I used the same logic as you did, implemented it using Simple HTML DOM, and it works fine.
1

Here is the working code:

$test = array();
$matching = false;
$collection = $html->find('p.ft00');
foreach ($collection as $tkey) {
    // Start collecting at the "Work Experience" heading (case must match the HTML).
    if ($tkey->plaintext == "Work Experience" || $matching) {
        $test[] = $tkey->plaintext;
        $matching = true;
    }
    // Stop once the next section heading has been collected.
    if ($tkey->plaintext == "EDUCATION AND TRAINING") {
        break;
    }
}
print_r($test);

Output:

Array
(
    [0] => Work Experience
    [1] => 27 July 2017
    [2] => 19 May 2018
    [3] => EDUCATION AND TRAINING
)
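
The array above still contains the two headings. If only the dates are needed, one option is to filter the collected strings afterwards. The sketch below assumes every date is written as day, full month name, year (as in the sample), and keepDates is a hypothetical helper name, not part of Simple HTML DOM.

```php
<?php
// Hypothetical helper: keep only strings that parse as "day month-name year".
function keepDates(array $texts): array
{
    $isDate = function (string $text): bool {
        // '!j F Y' = day of month, textual month name, year; anything else fails to parse.
        return DateTime::createFromFormat('!j F Y', trim($text)) !== false;
    };

    // array_values() reindexes the result from 0 after filtering.
    return array_values(array_filter($texts, $isDate));
}

$test = ['Work Experience', '27 July 2017', '19 May 2018', 'EDUCATION AND TRAINING'];
print_r(keepDates($test));
```

Parsing with DateTime::createFromFormat rather than a hand-written regular expression keeps the accepted date shape in one format string, which is easy to adjust if the documents use a different layout.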
