get table data with curl and regex

Question

This is my code to extract data from a table.

However, I want to remove the links from the output.

I would also like to know how to split the title and price of each row into an array.

<?php

$ch = curl_init ("http://www.digionline.ir/Allprovince/CategoryProducts/cat=10301");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$page = curl_exec($ch);

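// Capture everything between the first <table ...> tag and its closing </table>.
if (preg_match('#<table[^>]*>(.+?)</table>#is', $page, $matches)) {
    // $matches[1] holds the captured inner HTML of the table.
    echo '<table>';
    echo $matches[1];
    echo '</table>';
}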

?>

With the procedure above I can't extract the data into an array, and that is what I want to do.
– amir rasabeh, Commented Sep 2, 2014 at 8:11

Kevin · Accepted Answer · 2014-09-02 08:27:46Z

3

I suggest using an HTML parser instead. Use DOMDocument + DOMXPath; there is nothing to install, since both are built into PHP. Example:

$ch = curl_init ("http://www.digionline.ir/Allprovince/CategoryProducts/cat=10301");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$page = curl_exec($ch);

$dom = new DOMDocument();
libxml_use_internal_errors(true);
$dom->loadHTML($page);
libxml_clear_errors();
$xpath = new DOMXpath($dom);

$data = array();
// get all table rows except the header rows (class "rowH")
$table_rows = $xpath->query('//table[@id="tbl-all-product-view"]/tr[@class!="rowH"]');
foreach($table_rows as $row => $tr) {
    foreach($tr->childNodes as $td) {
        $data[$row][] = preg_replace('~[\r\n]+~', '', trim($td->nodeValue));
    }
    $data[$row] = array_values(array_filter($data[$row]));
}

echo '<pre>';
print_r($data);

$data should contain:

Array
(
    [0] => Array
    (
        [0] => AMDA4-3400
        [1] => 1,200,000
        [2] => 1,200,000
    )

    [1] => Array
    (
        [0] => AMDSempron 145
        [1] => 860,000
        [2] => 910,000
    )
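The same approach can be tried offline before pointing it at the live page. Below is a minimal sketch, assuming markup similar to what the query above expects: the sample HTML, the "row" class, and the function name parse_product_rows are made up for illustration, and the real page may be structured differently. Selecting only the td elements avoids collecting whitespace text nodes, and reading textContent keeps the link text while dropping the <a> tag itself.

```php
<?php
// Hypothetical helper: returns one array of cell texts per non-header row.
function parse_product_rows($html, $tableId)
{
    $dom = new DOMDocument();
    libxml_use_internal_errors(true);   // collect warnings from imperfect HTML instead of printing them
    $dom->loadHTML($html);
    libxml_clear_errors();
    $xpath = new DOMXPath($dom);

    $data = array();
    // Rows whose class is not "rowH" (rows without a class attribute are skipped too).
    $rows = $xpath->query('//table[@id="' . $tableId . '"]//tr[@class!="rowH"]');
    foreach ($rows as $tr) {
        $cells = array();
        // Only <td> elements, so whitespace between cells is never collected.
        foreach ($xpath->query('./td', $tr) as $td) {
            // textContent keeps the text of a nested <a> but not the tag itself.
            $cells[] = trim(preg_replace('~\s+~', ' ', $td->textContent));
        }
        $data[] = $cells;
    }
    return $data;
}

// Made-up sample markup, standing in for the downloaded page.
$sample = '<table id="tbl-all-product-view">'
        . '<tr class="rowH"><th>Title</th><th>Price 1</th><th>Price 2</th></tr>'
        . '<tr class="row"><td><a href="/p/1">AMD A4-3400</a></td><td>1,200,000</td><td>1,200,000</td></tr>'
        . '<tr class="row"><td><a href="/p/2">AMD Sempron 145</a></td><td>860,000</td><td>910,000</td></tr>'
        . '</table>';

$data = parse_product_rows($sample, 'tbl-all-product-view');
print_r($data);
```

Each entry of $data is then a plain list such as title, first price, second price, without any anchor markup.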

6 Comments

amir rasabeh Over a year ago

How can I update the page every day with curl? I guess a cron job could do it, but I don't know how to work with that.

amir rasabeh Over a year ago

Please see this question: stackoverflow.com/questions/25696986/…

Peacefull Over a year ago

@Ghost thank you for this code, but can you please help me with a different table structure? Here is what I need: example. Regards

Kevin Over a year ago

@Peacefull the idea is just the same: use curl to get the HTML, then use DOM to parse it.

Peacefull Over a year ago

@Ghost yes, I followed your code and it works great for parsing the HTML table, but I can't get it to return the titles associated with the values. Can you give me an example, please?
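One possible way to associate titles with values, sketched here under the assumption that the first row of the table holds the column titles: read that row as the keys and combine every later row with it via array_combine. The function name table_to_assoc and the sample markup are hypothetical, and rows whose cell count differs from the header are simply skipped.

```php
<?php
// Hypothetical helper: keys each row's cells by the labels found in the first row.
function table_to_assoc($html)
{
    $dom = new DOMDocument();
    libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    $xpath = new DOMXPath($dom);

    $headers = array();
    $rows = array();
    foreach ($xpath->query('//table//tr') as $tr) {
        $cells = array();
        // Accept both header cells (<th>) and data cells (<td>).
        foreach ($xpath->query('./th | ./td', $tr) as $cell) {
            $cells[] = trim(preg_replace('~\s+~', ' ', $cell->textContent));
        }
        if (!$headers) {
            // Assumption: the first non-empty row contains the column titles.
            $headers = $cells;
        } elseif (count($cells) === count($headers)) {
            // array_combine needs the same number of keys and values.
            $rows[] = array_combine($headers, $cells);
        }
    }
    return $rows;
}

// Made-up sample markup for illustration.
$sample = '<table>'
        . '<tr><th>Title</th><th>Price</th></tr>'
        . '<tr><td>AMD A4-3400</td><td>1,200,000</td></tr>'
        . '<tr><td>AMD Sempron 145</td><td>860,000</td></tr>'
        . '</table>';

$assoc = table_to_assoc($sample);
print_r($assoc);
```

With this shape, a value can be read by its column title, for example $assoc[0]['Price'].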


Dmitriy.Net · Answer · 2014-09-02 07:44:37Z

0

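If you want to parse a web resource, you can use PHP Simple HTML DOM Parser.

To get a table and all the links inside it:

include 'simple_html_dom.php'; // the library file has to be downloaded separately

$html = file_get_html('http://www.digionline.ir/Allprovince/CategoryProducts/cat=10301');
$table = $html->find('table', 0); // index 0 returns the first match instead of an array
$links = $table->find('a');       // array of all <a> elements inside that table

echo $table;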
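For the "delete links" part of the question, here is a rough sketch that uses PHP's built-in DOM extension rather than the library above. It replaces every <a> element with its plain text and returns the first table's HTML. The function name strip_links and the sample markup are hypothetical.

```php
<?php
// Hypothetical helper: returns the first table's HTML with every <a> replaced by its text.
function strip_links($html)
{
    $dom = new DOMDocument();
    libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();

    // Copy the node list to an array first: the list is live and
    // shrinks while nodes are being replaced.
    $links = iterator_to_array($dom->getElementsByTagName('a'));
    foreach ($links as $a) {
        $text = $dom->createTextNode($a->textContent);
        $a->parentNode->replaceChild($text, $a);
    }

    $table = $dom->getElementsByTagName('table')->item(0);
    return $table ? $dom->saveHTML($table) : '';
}

// Made-up sample markup for illustration.
$sample = '<table><tr><td><a href="/p/1">AMD A4-3400</a></td><td>1,200,000</td></tr></table>';

$clean = strip_links($sample);
echo $clean;
```

The returned string can be echoed as is, or passed on to a row parser if an array is needed instead of HTML.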
