I am developing a module that generates JSON in the following structure from MySQL data.

 {"employee":
   [
       {"id":1,
        "name":"jhon doe",
        "register_date":"2011-05-11",
        "education":
             [
                {"degree":"B.A","description":"History"},
                {"degree":"M.A","description":"History"}
             ]
       },
       {"id":2,
        "name":"Smith",
        "register_date":"2011-06-11",
        "education":
             [
                {"degree":"B.E","description":"Mechanical"},
                {"degree":"M.E","description":"Mechanical"}
             ]
       }
   ]
 }

To achieve this, I first retrieve the employee data for a register_date range as follows.

 $result=mysql_query("SELECT * FROM employee WHERE register_date>='2011-05-01' AND register_date<'2011-07-01'");

Then I iterate over each row of the result and retrieve its education information from the education table as follows:

 while($row = mysql_fetch_assoc($result)){
   $id=$row["id"];
   $education=mysql_query("SELECT degree,description from education where emp_id=$id");
   // assign the education rows to the education field in $row
   // write json_encode($row) to output buffer
 }

This project's data structure is not my design. I know it is not a good idea to set the employee's id as a foreign key in the education table; instead, the education id should be a foreign key in the employee table. My problem is that, with this data structure, retrieving the education list for each employee row is a huge performance issue: there may be about 500000 employee records per month, and for that amount, 500000 MySQL SELECT queries have to be processed to retrieve the education data, one per employee. How should I optimize this?

1. Should I change the data structure?
2. Should I create a MySQL stored procedure that generates the JSON string directly from the database?
3. Should I denormalize the education data into the employee table?

Which is the most efficient approach, or do you have any other suggestions?

Please help me.

  • en.wikipedia.org/wiki/Join_(SQL) :P You do one query, joining the tables properly, and you can get all the data at once. (Oh, and I'm pretty much obligated to mock you for still using mysql_query in 2015.) Commented Jan 9, 2015 at 4:27
  • I think this is not the bottleneck of your performance issues (if you have any). Querying a single row by its primary key is fast enough (if you have correctly configured the MySQL server caching params). The best you can do is to add another layer of caching between your PHP code and MySQL using something like a PHP ORM library. Commented Jan 9, 2015 at 5:28
  • @Olim: It's certainly a bottleneck. One query for one row by primary key, sure, that's pretty fast. Half a million queries, though? Not so much. The network/parse/execute times add up, caching could actually be a hindrance (since the queries are all asking for different data, every query is likely to result in cache misses), ORM just adds unnecessary object construction and N+1-query dangers to the pile, and the strictly serial handling of each query leads to lots of dead air. I can easily picture a 10-100x speedup here. Commented Jan 9, 2015 at 13:58

3 Answers

1

You can fetch the employees first, then fetch all of their education rows with a single second query and attach them to the employees in a PHP loop.

$result = mysql_query("select * from employee where register_date >= '2011-05-01' and register_date < '2011-07-01'");

$employees = [];

while ($row = mysql_fetch_assoc($result)) {
    // Start every employee with an empty list so the key always exists.
    $row['education'] = [];
    $employees[$row['id']] = $row;
}

// An empty "in ()" list is a syntax error, so skip the query when nothing matched.
if ($employees) {
    // The ids come from the database; cast them anyway before interpolating.
    $ids = join(', ', array_map('intval', array_keys($employees)));

    // One query for all education rows; emp_id is already in the table, so no join is needed.
    $result = mysql_query(sprintf('select degree, description, emp_id from education where emp_id in (%s)', $ids));

    while ($row = mysql_fetch_assoc($result)) {
        $id = $row['emp_id'];
        unset($row['emp_id']);

        $employees[$id]['education'][] = $row;
    }
}
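
With around 500000 employees, the `in (...)` list built above can get very long. One possible refinement, sketched here on the assumption that the ids are processed in batches, is to split them with `array_chunk` and build one placeholder list per batch for a prepared statement. The helper name `buildInQueries` and the batch size are hypothetical, not part of the code above.

```php
<?php
// Hypothetical helper: returns one [sql, params] pair per batch of ids,
// ready for a prepared statement (for example PDO::prepare + execute).
function buildInQueries(array $ids, int $batchSize = 1000): array
{
    $queries = [];
    foreach (array_chunk($ids, $batchSize) as $chunk) {
        // One "?" placeholder per id in this batch.
        $placeholders = implode(', ', array_fill(0, count($chunk), '?'));
        $sql = "SELECT emp_id, degree, description FROM education WHERE emp_id IN ($placeholders)";
        $queries[] = [$sql, $chunk];
    }
    return $queries;
}

// Five sample ids in batches of two give three queries.
$queries = buildInQueries([1, 2, 3, 4, 5], 2);
foreach ($queries as list($sql, $params)) {
    echo $sql, ' <- ', json_encode($params), "\n";
}
```

Each batch is then executed separately and its rows are appended to `$employees[$id]['education']` exactly as in the loop above.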

0

As pointed out in the comments, you could probably just use some sort of JOIN.
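
A minimal sketch of the JOIN approach, assuming the `employee` and `education` tables from the question: one query returns one flat row per employee/education pair, and PHP folds those rows back into the nested structure. The helper name `nestEducation` and the column alias `employee_id` are illustrative only, and the sample array stands in for a fetched result set.

```php
<?php
// Query this sketch assumes (LEFT JOIN keeps employees without education rows):
//   SELECT e.id AS employee_id, e.name, e.register_date, ed.degree, ed.description
//   FROM employee e
//   LEFT JOIN education ed ON ed.emp_id = e.id
//   WHERE e.register_date >= '2011-05-01' AND e.register_date < '2011-07-01'

// Hypothetical helper: folds flat JOIN rows into one entry per employee.
function nestEducation(array $rows): array
{
    $employees = [];
    foreach ($rows as $row) {
        $id = $row['employee_id'];
        if (!isset($employees[$id])) {
            $employees[$id] = [
                'id' => $id,
                'name' => $row['name'],
                'register_date' => $row['register_date'],
                'education' => [],
            ];
        }
        // A LEFT JOIN yields NULL columns for employees with no education rows.
        if ($row['degree'] !== null) {
            $employees[$id]['education'][] = [
                'degree' => $row['degree'],
                'description' => $row['description'],
            ];
        }
    }
    // Re-index so json_encode() emits a JSON array, not an object keyed by id.
    return array_values($employees);
}

// Sample rows standing in for the fetched result set.
$rows = [
    ['employee_id' => 1, 'name' => 'jhon doe', 'register_date' => '2011-05-11', 'degree' => 'B.A', 'description' => 'History'],
    ['employee_id' => 1, 'name' => 'jhon doe', 'register_date' => '2011-05-11', 'degree' => 'M.A', 'description' => 'History'],
    ['employee_id' => 2, 'name' => 'Smith', 'register_date' => '2011-06-11', 'degree' => 'B.E', 'description' => 'Mechanical'],
    ['employee_id' => 2, 'name' => 'Smith', 'register_date' => '2011-06-11', 'degree' => 'M.E', 'description' => 'Mechanical'],
];

$out = nestEducation($rows);
echo json_encode(['employee' => $out]), "\n";
```

The trade-off is that the employee columns are repeated once per education row, so the result set is wider than with two separate queries.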

Other than that, I'd like to introduce you to the following quick & dirty solution. This might use a lot of RAM.

$arr_education = array();
$res_education = mysql_query("SELECT emp_id, degree, description FROM education");

while($row_education = mysql_fetch_assoc($res_education)) {
    // append, so an employee with several degrees keeps all of them
    $arr_education[$row_education["emp_id"]][] = array("degree"=>$row_education["degree"], "description" => $row_education["description"]);
}

$result=mysql_query("SELECT * FROM employee WHERE register_date>='2011-05-01' AND register_date<'2011-07-01'");

while($row=mysql_fetch_assoc($result)) {
    //$arr_education[$row["id"]] is an array containing the education-entries for the employee
    $tmp = $row;
    // employees without education rows get an empty list
    $tmp["education"] = isset($arr_education[$row["id"]]) ? $arr_education[$row["id"]] : array();
    echo json_encode($tmp);
}
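
Echoing `json_encode` once per row, as above, emits a run of separate objects rather than the single `{"employee": [...]}` document from the question. Below is a minimal sketch of one way to stream that structure without holding every row in memory. The function name `streamEmployees` is an assumption, and the sample array stands in for the rows fetched in the loop.

```php
<?php
// Hypothetical helper: writes {"employee":[...]} piece by piece so that
// only one row is encoded at a time.
function streamEmployees(iterable $rows): void
{
    echo '{"employee":[';
    $first = true;
    foreach ($rows as $row) {
        if (!$first) {
            echo ',';   // separator between array elements
        }
        echo json_encode($row);
        $first = false;
    }
    echo ']}';
}

// Sample rows; in the real loop these would come from the database.
$rows = [
    ['id' => 1, 'name' => 'jhon doe', 'education' => [['degree' => 'B.A', 'description' => 'History']]],
    ['id' => 2, 'name' => 'Smith', 'education' => []],
];

// Capture the output here only to show that it is one valid JSON document.
ob_start();
streamEmployees($rows);
$json = ob_get_clean();
echo $json, "\n";
```

Because the function accepts any iterable, a generator that yields rows from the database cursor could be passed in place of the array.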

To your specific questions:

  1. If emp -> education is a 1 to 1 relation you should change the data structure.

  2. Don't do that

  3. Don't do that

Consider switching to PDO! http://php.net/manual/en/book.pdo.php It won't help you in this specific case, but you should never use mysql_* functions.
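
For reference, a minimal PDO sketch of a parameterized date-range query. To stay self-contained it uses an in-memory SQLite database, which assumes the `pdo_sqlite` extension is available; against MySQL, only the DSN and credentials passed to the `PDO` constructor should need to change. The table contents are sample data.

```php
<?php
// In-memory SQLite stands in for the MySQL server (an assumption of this sketch).
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Sample schema and rows, loosely modelled on the question.
$pdo->exec('CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, register_date TEXT)');
$pdo->exec("INSERT INTO employee VALUES (1, 'jhon doe', '2011-05-11'), (2, 'Smith', '2011-06-11'), (3, 'Late', '2011-08-01')");

// Named placeholders keep the values out of the SQL string.
$stmt = $pdo->prepare(
    'SELECT id, name, register_date FROM employee
     WHERE register_date >= :start AND register_date < :end
     ORDER BY id'
);
$stmt->execute([':start' => '2011-05-01', ':end' => '2011-07-01']);
$employees = $stmt->fetchAll(PDO::FETCH_ASSOC);

echo json_encode($employees), "\n";
```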

0

It may be better to use stored procedures (SPs) to process the data and return the JSON string. Let the DB system process and iterate. I think it will improve a lot compared with the current implementation.
