I'm scraping the following HTML table:

<table>
 <tr>
  <td class="Name">A</td>
  <td class="S1">5</td>
  <td class="S2">6</td>
 </tr>
</table>

My goal is to use simple_html_dom to parse the data and insert the values into a MySQL database. Here's what I have so far:

<?php
include('../simple_html_dom.php');
include('dbconnect.php');
$html = file_get_html('url');
$table = $html->find('table');
foreach ($table->find('tr') as $row) {
 foreach ($row->find('td[class=Name]') as $cell) {
  $name = $cell->plaintext;
  }
}

The issue I'm running into is that my $name variable is actually an array. I'm getting stuck with duplicates if I do this instead:

foreach ($table->find('tr') as $row) {
 foreach ($row->find('td[class=Name]') as $cell) {
  }
  $name = $cell->plaintext;
}

My ultimate goal would be a MySQL query such as this:

$sql = Insert into ScoreTable (Score1, Score2)
       Values ($S1, $S2)
       Where PName = $Name

However I can't separate the array values I'm getting when I "find" and I can't even isolate the html elements into variables. Where am I going wrong?

edit: Fixed what my goal is.

  • It would be helpful to see a sample of the html table. Absent that, what I see is that you're only going to get the last row because $name is not an array, and it is being reassigned every time through the nested foreach loops. Commented Feb 12, 2018 at 4:45
  • I would expect you could build your own data structure like this: foreach ($table->find('tr') as $row) { $data = array('name' => $row->find('td[class=Name]')->plaintext, 'S1' => $row->find('td[class=S1]')->plaintext, 'S2' => $row->find('td[class=Name]')->plaintext); } assuming that your syntax for finding the class name is correct. I haven't used simple_html_dom, but I would expect it works this way. Commented Feb 12, 2018 at 4:55
  • You could either iterate through the array and make an insert each time, or you could build the insert statement before executing it. Commented Feb 12, 2018 at 5:02
  • And I copied and pasted without changing class=Name to class=S2 for the 'S2' entry. Too close to bedtime... Commented Feb 12, 2018 at 5:06
  • That means the find() method didn't find what it was looking for, so when we asked for the plaintext property, there was no object (what it should have found) for it to get the property from. This would happen if you have tr elements without the classes you're looking for. My bad, I shouldn't have chained it together. You'll have to test whether find() actually succeeded before trying to get the plaintext. Commented Feb 12, 2018 at 5:26

2 Answers

You don't need a loop if you only need one value. You can take the first element of the array that find() returns, or pass an index as the second parameter of find().

See the manual: http://simplehtmldom.sourceforge.net/manual.htm

// Find (N)th anchor, returns element object or null if not found (zero based)

$ret = $html->find('a', 0);
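
The null case matters in practice: a row without a matching cell makes find() return null, and reading ->plaintext from null fails. A minimal sketch of a guard, assuming a hypothetical helper named cellText() (not part of simple_html_dom) and a $row that behaves like a simple_html_dom element:

```php
<?php
// Hypothetical helper, not part of simple_html_dom.
// Returns the trimmed text of the first element matching $selector,
// or null when the row has no such element.
function cellText($row, string $selector): ?string
{
    $cell = $row->find($selector, 0); // element object or null
    if ($cell === null) {
        return null;
    }
    return trim($cell->plaintext);
}
```

With it, $name = cellText($row, 'td.Name'); yields either the cell text or null, so a row can be skipped with a plain null check.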

Your MySQL INSERT has the wrong format: INSERT does not take a WHERE clause (see https://dev.mysql.com/doc/refman/5.7/en/insert.html). The correct form is:

INSERT INTO ScoreTable (Score1, Score2, Pname) VALUES ('$S1', '$S2','$name')

I don't know what's in your "dbconnect.php", but if it contains something like "$mysqli = mysqli_connect", your code would be:

foreach ($table->find('tr') as $row) {
   // find() with an index returns one element, or null if nothing matches
   $nameCell = $row->find('td.Name', 0);
   $s1Cell = $row->find('td.S1', 0);
   $s2Cell = $row->find('td.S2', 0);
   if ($nameCell && $s1Cell && $s2Cell) { // skip rows that lack any of the three cells
          $name = $mysqli->real_escape_string($nameCell->plaintext); // escape special characters
          $S1 = $mysqli->real_escape_string($s1Cell->plaintext);
          $S2 = $mysqli->real_escape_string($s2Cell->plaintext);

          // Run the query and check whether it succeeded
          if ($mysqli->query("INSERT INTO ScoreTable (Score1, Score2, Pname) VALUES ('$S1', '$S2','$name')") === TRUE) {
                echo "Success\n";
          }
   }

}

Check your dbconnect.php and substitute the name of your connection variable for $mysqli.

It's also strongly recommended to use real_escape_string() to escape special characters, especially when you work with outside data.
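
As an alternative to escaping by hand, a prepared statement keeps the scraped values out of the SQL text altogether. A minimal sketch, assuming $mysqli is the connection object from dbconnect.php and that the columns accept string input; insertScore() is a hypothetical name:

```php
<?php
// Hypothetical helper. $mysqli is assumed to be a connected mysqli object.
// Values are bound as strings ('sss'); MySQL converts them for numeric columns.
function insertScore($mysqli, string $name, string $s1, string $s2): bool
{
    $stmt = $mysqli->prepare(
        'INSERT INTO ScoreTable (Score1, Score2, Pname) VALUES (?, ?, ?)'
    );
    if ($stmt === false) {
        return false; // the statement could not be prepared
    }
    // Bind in the same order as the placeholders: Score1, Score2, Pname
    $stmt->bind_param('sss', $s1, $s2, $name);
    $ok = $stmt->execute();
    $stmt->close();
    return $ok;
}
```

Inside the row loop this would be called as insertScore($mysqli, $name, $S1, $S2), with the unescaped plaintext values.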

I don't know if this is the best answer, but it's working:

HTML Table scraped from a website:

<table>
 <tr>
  <td class="Name">A</td>
  <td class="S1">5</td>
  <td class="S2">6</td>
 </tr>
</table>

I ended up creating an array for each class I was scraping, combining it all into a single array, then updating the database from the merged array:

<?php
include("connect.php");
include("simple_html_dom.php");
$html = file_get_html('url', TRUE);

$table = $html->find('table',0);
foreach($table->find('tr') as $row){

// Collect each cell's text into its own one-element array
$g_name = array();
foreach($row->find('td[class=Name]') as $cell){
 $g_name['Name'] = $cell->plaintext;
}

$g_s1 = array();
foreach($row->find('td[class=S1]') as $cell){
 $g_s1['S1'] = $cell->plaintext;
}

$g_s2 = array();
foreach($row->find('td[class=S2]') as $cell){
 $g_s2['S2'] = $cell->plaintext;
}

// Merge into one array per row, then build the UPDATE for that row
$data = array_merge($g_name,$g_s1,$g_s2);
$sql = "Update table SET
Rd1='".$data['S1']."',
Rd2='".$data['S2']."'
WHERE Player = '".$data['Name']."'";

}
?>

I then close my MySQL connection or display errors etc... I'm not sure if it's the best way to do it, but it's working.
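
The three inner loops could probably be collapsed, since each row has at most one cell per class. A rough sketch, assuming a hypothetical helper rowToData() and that find() with an index of 0 returns the first match or null, as the manual quoted in the other answer describes:

```php
<?php
// Hypothetical helper, not part of simple_html_dom.
// Builds array('Name' => ..., 'S1' => ..., 'S2' => ...) for one table row,
// or returns null if any of the expected cells is missing.
function rowToData($row, array $classes = ['Name', 'S1', 'S2']): ?array
{
    $data = [];
    foreach ($classes as $class) {
        $cell = $row->find('td[class=' . $class . ']', 0);
        if ($cell === null) {
            return null; // e.g. a header row without these cells
        }
        $data[$class] = trim($cell->plaintext);
    }
    return $data;
}
```

Inside the row loop this would be $data = rowToData($row); followed by a null check before building the UPDATE.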

Thanks all for your help.
