
I am very new to coding in PHP/MySQL. For fun, I've been downloading some World of Warcraft data from Blizzard's API to test what I've learnt about PHP.

Sorry for the long post, I'm trying to give as much info as needed, whilst being concise :)

The short version is: "My code takes 25 minutes to run. Can I do anything better to speed this up?"

LONG VERSION BELOW

Blizzard hold a lot of data about their servers in JSON format, with one JSON file per server (I have a list of 123 servers in total). I've written a basic JSON file that holds the name & URL of the 123 servers (serv_gb.json).

The code should run as follows:-

  • Read serv_gb.json to count the number of rows to use in a For loop.
  • Record a start time for my own check.
  • The For loop starts, reading and accessing the URLs in serv_gb.json.
  • Each URL holds a JSON file for that server, which is parsed and held in an array. When opened in a web browser, the URL takes on average 20 seconds to load and holds approx. 100k rows.
  • It then uses the foreach to read & send the data to my database.
  • The end time is recorded & a duration calculated.

There is more data in the JSON file than I need. I've tried two ways to filter this.

a) Just try to send all of the data in the JSON and, if an insert fails and would normally raise an error, hide it with "error_reporting(0);". (Lazy, yes, but because this is a pet project just for me, I thought I'd try it!)

b) Adding the following if statement around the SQL to only send the correct data to the database.

foreach ($auctions as $val) {
    if ($val['item'] == 82800) {
        $sql = "INSERT INTO `petshop`.`ah` (datedatacollected, seller, petSpeciesId, server, buyout, region, realrealm) VALUES('".$cTime."','".$val['owner']."','".$val['petSpeciesId']."', '".$servername."', '".$val['buyout']."','gb','".$val['ownerRealm']."')";
        $mysqli->query($sql);
    }
}

Idea A is about 15% slower than Idea B. Idea B takes the 25 minutes to run through all 123 JSON files.

I run this on a local machine (my own desktop PC) running Apache 2.4, MySQL 5.7 & PHP 5.6.32.

Any advice is appreciated. I think the following are good questions that I need help learning more about:

  • Are there any ways to make the loops or reads faster (or am I at the mercy of Blizzard's servers)?
  • Is it the writing to the DB that's bottlenecking the process?
  • Would splitting the URLs up & triggering multiple JSON reads at the same time speed this up?

If you've got this far in the post, a huge THANK YOU for taking the time to read it. I hope it was clear and made sense!

MY PHP CODE

// Note: $mysqli (an open mysqli connection) and $cTime (the collection timestamp)
// are assumed to be defined earlier in the script (not shown).
$servjson = '../servers/serv_gb.json';
$servcontents = file_get_contents($servjson);
$servdata = json_decode($servcontents, true);

$jrc = count($servdata['servers']);

echo "Json Row Count: ".$jrc."<p>";

$sTime = (new \DateTime())->format('Y-m-d H:i:s');

for ($i = 0; $i < $jrc; $i++) {
    // Download and decode one server's auction file.
    $json = $servdata['servers'][$i]['url'];
    $contents = file_get_contents($json);
    $data = json_decode($contents, true);

    $servername = $data['realms'][0]['slug'];

    // Send one INSERT per auction row.
    $auctions = $data['auctions'];
    foreach ($auctions as $val) {
        $sql = "INSERT INTO `petshop`.`ah` (datedatacollected, seller, petSpeciesId, server, buyout, region, realrealm) VALUES('".$cTime."','".$val['owner']."','".$val['petSpeciesId']."', '".$servername."', '".$val['buyout']."','gb','".$val['ownerRealm']."')";
        $mysqli->query($sql);
    }
}

$eTime = (new \DateTime())->format('Y-m-d H:i:s');
$dTime = abs(strtotime($eTime) - strtotime($sTime));

  • Creating a work queue would be better, as you would be splitting the work between multiple instances, but you could start off by using curl multi and prepared queries, and by moving the data lookup into generators. Apart from that, 123*100k rows will take a while. Commented Mar 22, 2018 at 12:30
  • I love the ideas, thank you! I'm not at all familiar with any of those things you've suggested, but with the power of search engines I have faith that I will learn new & helpful things. Commented Mar 23, 2018 at 18:54
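
The "curl multi" suggestion in the first comment could look roughly like the sketch below. It is only an illustration: fetch_all() is a hypothetical helper name, the 120 second timeout is an arbitrary assumption, and the code assumes the PHP cURL extension is installed. It starts all transfers at once and waits until every one has finished or failed.

```php
<?php
// Hypothetical helper: download several URLs concurrently with curl_multi.
// Returns an array with the same keys as $urls; a value is the response
// body as a string, or null if that transfer failed.
function fetch_all(array $urls)
{
    $multi = curl_multi_init();
    $handles = array();

    foreach ($urls as $key => $url) {
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
        curl_setopt($ch, CURLOPT_TIMEOUT, 120);         // assumed limit, adjust as needed
        curl_multi_add_handle($multi, $ch);
        $handles[$key] = $ch;
    }

    // Drive all transfers until none are still running.
    do {
        $status = curl_multi_exec($multi, $running);
        if ($running > 0 && curl_multi_select($multi, 1.0) === -1) {
            usleep(100000); // some builds return -1 immediately; avoid a busy loop
        }
    } while ($running > 0 && ($status === CURLM_OK || $status === CURLM_CALL_MULTI_PERFORM));

    $bodies = array();
    foreach ($handles as $key => $ch) {
        $bodies[$key] = curl_error($ch) === '' ? curl_multi_getcontent($ch) : null;
        curl_multi_remove_handle($multi, $ch);
        curl_close($ch);
    }
    curl_multi_close($multi);

    return $bodies;
}
```

Holding 123 large JSON documents in memory at once is probably not practical, so one option is to pass the URLs in small batches, for example with array_chunk($urls, 5, true), and to decode and insert each batch before fetching the next.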

2 Answers

1

There are at least two topics here:

A) You are downloading feeds from remote servers to your local machine, which of course takes some time. If the remote server can compress its responses, request that it does so; this reduces the amount of data that has to be transferred to the client. However, this is not straightforward with the plain built-in PHP functions. Take a look at PHP cURL: curl_setopt with CURLOPT_ENCODING.
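
As a minimal sketch of that idea (assuming the PHP cURL extension is available): build_curl_options() and fetch_json() are hypothetical helper names, and the timeout value is an arbitrary assumption. Setting CURLOPT_ENCODING to an empty string makes cURL send an Accept-Encoding header listing every encoding it supports and decompress the response transparently. Whether Blizzard's servers actually compress these files is not confirmed here.

```php
<?php
// Hypothetical helper: cURL options for one download with compression requested.
function build_curl_options($url)
{
    return array(
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true, // return the body instead of printing it
        CURLOPT_ENCODING       => '',   // '' = advertise all encodings cURL supports
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_TIMEOUT        => 120,  // assumed limit, adjust as needed
    );
}

// Hypothetical helper: download a URL and decode it as a JSON array.
function fetch_json($url)
{
    $ch = curl_init();
    curl_setopt_array($ch, build_curl_options($url));
    $body = curl_exec($ch);
    $error = curl_error($ch);
    curl_close($ch);

    if ($body === false) {
        throw new RuntimeException('Download failed: ' . $error);
    }

    $data = json_decode($body, true);
    if ($data === null && json_last_error() !== JSON_ERROR_NONE) {
        throw new RuntimeException('Response was not valid JSON');
    }

    return $data;
}
```

In the question's loop, `$data = fetch_json($json);` would then stand in for the file_get_contents() and json_decode() pair.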

B) Database performance. With MySQL you can choose between the MyISAM and InnoDB engines. Inserting thousands of records could work better with InnoDB and a transaction, which you commit once all queries have succeeded. Tuning PHP and MySQL is a serious topic and cannot be fully covered here.


1 Comment

Thank you for the reply! I'll look at using cURL and report back. I am using an InnoDB engine for the DB.
0

Please echo a sample INSERT statement. There may be some issues with it.

Around the INSERT loop, add BEGIN and COMMIT. This will put all the inserts into a single transaction, thereby making it faster.
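
A rough sketch of what that could look like with mysqli, using the table and column names from the question. filter_pet_auctions() and insert_auctions() are hypothetical helper names, and binding the values through a prepared statement rather than concatenating them into the SQL is my own addition. Note that begin_transaction() is called without flags, which starts an ordinary read/write transaction.

```php
<?php
// Hypothetical helper: keep only the auctions for one item id
// (82800 is the id the question filters on).
function filter_pet_auctions(array $auctions, $itemId = 82800)
{
    $kept = array();
    foreach ($auctions as $val) {
        if (isset($val['item']) && $val['item'] == $itemId) {
            $kept[] = $val;
        }
    }
    return $kept;
}

// Hypothetical helper: insert one server's auctions inside a single transaction.
// $mysqli is assumed to be an open mysqli connection.
function insert_auctions(mysqli $mysqli, array $auctions, $cTime, $servername, $region)
{
    $stmt = $mysqli->prepare(
        "INSERT INTO `petshop`.`ah`
         (datedatacollected, seller, petSpeciesId, server, buyout, region, realrealm)
         VALUES (?, ?, ?, ?, ?, ?, ?)"
    );
    if ($stmt === false) {
        throw new RuntimeException('Prepare failed: ' . $mysqli->error);
    }

    // bind_param binds by reference, so changing these variables inside the
    // loop changes what the next execute() sends.
    $seller = $species = $buyout = $realm = null;
    $stmt->bind_param('ssissss', $cTime, $seller, $species, $servername, $buyout, $region, $realm);

    $mysqli->begin_transaction(); // no flags: a normal read/write transaction
    try {
        foreach ($auctions as $val) {
            $seller  = $val['owner'];
            $species = isset($val['petSpeciesId']) ? (int) $val['petSpeciesId'] : null;
            $buyout  = (string) $val['buyout'];
            $realm   = $val['ownerRealm'];
            if (!$stmt->execute()) {
                throw new RuntimeException('Insert failed: ' . $stmt->error);
            }
        }
        $mysqli->commit(); // all rows become visible together
    } catch (Exception $e) {
        $mysqli->rollback(); // undo the partial batch
        $stmt->close();
        throw $e;
    }
    $stmt->close();

    return count($auctions);
}
```

Inside the question's for loop this might be called once per server, for example `insert_auctions($mysqli, filter_pet_auctions($data['auctions']), $cTime, $servername, 'gb');`, so that each server's rows are committed as one transaction.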

I prefer microtime(true) for measuring time.

25 minutes? How long does it take if you comment out the INSERTs? This is to see whether it is MySQL or not.
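
One way to get those numbers in a single run is to time each phase separately with microtime(true). The sketch below is only an illustration: timed() is a hypothetical helper, and the closures are stand-ins for the real download, decode and insert steps.

```php
<?php
// Hypothetical helper: run a callable and add its wall-clock duration
// (in seconds) to the bucket named $label.
function timed($label, array &$totals, $fn)
{
    $start = microtime(true);
    $result = call_user_func($fn);
    $elapsed = microtime(true) - $start;
    $totals[$label] = (isset($totals[$label]) ? $totals[$label] : 0.0) + $elapsed;
    return $result;
}

$totals = array();

// Stand-in for file_get_contents($url).
$json = timed('download', $totals, function () {
    return '{"auctions":[{"item":82800}]}';
});

// Decoding is timed separately from the download.
$data = timed('decode', $totals, function () use ($json) {
    return json_decode($json, true);
});

// The INSERT loop would be wrapped in the same way under an 'insert' label.

foreach ($totals as $label => $seconds) {
    printf("%-10s %.3f s\n", $label, $seconds);
}
```

Because the totals accumulate across calls, wrapping the three steps inside the for loop gives one figure per phase for all 123 servers.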

4 Comments

Thank you! INSERT INTO petshop.ah (datedatacollected, seller, petSpeciesId, server, buyout, region, realrealm) VALUES('2018-03-23 18:45:46','Ilmago','55', 'nemesis', '1290099','it','Nemesis') is the INSERT statement. As for the Begin & Commit, I'm very new to this, I tried using $mysqli->begin_transaction(MYSQLI_TRANS_START_READ_ONLY); & $mysqli->commit();, the code ran but my DB didn't populate. I need to read more about this & learn the correct usage of it.
The timings of the code are as follows. Commented out insert Duration = 903 seconds & Full run Duration = 2151 seconds
Well, "READ_ONLY" says that you won't be writing (INSERTing).
Oh! Oops, I thought I would read the data & the commit would still send to the DB. Thank you! To be honest, I'm struggling to get the Begin/Commit to work. Would you be able to give me an example or a link where I can read up on it please?
