
I have an array in PHP which is populated via XML. This array holds roughly 21,000 items.

I am currently looping through the array, checking whether the name node exists in the database (MySQL): if it does, I update it; otherwise I insert the new data. I store the row ID of the inserted/updated row, then, in the same loop, insert more data into another table and link it to the first table: http://pastebin.com/iiyjkkuy

The array looks like this: http://pastebin.com/xcnHxeLk

Now, due to the large number of nodes in the array (21,000), this exceeds the 300-second (5 minute) max execution time on my dev system.

What is the best way to loop through an array of this size and insert data?

Just some more information on this: I am using ExpressionEngine 1.8.6 (for work reasons) and I have to use its built-in database class.

The reason for the SELECT statements before each insert/update is to grab the row ID for use in later statements. The data has to be structured in the DB in a certain way. For example:

Each source node has a papergroup node, which needs inserting/updating first. Each paper name node then needs to be linked to its paper group in the same table, and the sourceid etc. are inserted into a sources table with a link to their parent paper in the papers table. So the basic DB schema is:

papergroup is inserted into the papers table
paper name is inserted into the papers table, with papers.PID as the link to the paper group's papers.ID
sources are inserted into the sources table and linked to the papers table on sources.paperID
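If I've read that description right, the tables might look roughly like this (the names papers.ID, papers.PID, and sources.paperID come from the description above; all types and any extra columns are guesses):

```sql
-- Sketch only: types and extra columns are assumptions.
CREATE TABLE papers (
    ID   INT AUTO_INCREMENT PRIMARY KEY,
    PID  INT NULL,               -- parent paper group (papers.ID); NULL for groups
    name VARCHAR(255) NOT NULL   -- holds either a papergroup or a papername
);

CREATE TABLE sources (
    ID         INT AUTO_INCREMENT PRIMARY KEY,
    paperID    INT NOT NULL,     -- links to papers.ID
    sourceid   INT,
    sourcename VARCHAR(255),
    sourcesize VARCHAR(50)
);
```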

the basic structure of the XML source that populates the array is as follows:

<sources>
  <source>
    <sourceid>1</sourceid>
    <papername>test</papername>
    <papergroup>test group</papergroup>
    <papertype>Standard</papertype>
    <sourcename> test source</sourcename>
    <sourcesize>page</sourcesize>
  </source>
</sources>

The above is not a full segment, but it illustrates the point about all the information being sent in one section. Hope this helps.

OK, I managed to get some timings. It takes 1:35:731 to get the XML; each array loop iteration (select, insert/update) then takes between 0:0:025 and 0:0:700.

4 Comments

  • You might consider PDO and prepared statements. You seem to be serially calling the same query over and over again, something with which prepared statements may help. Commented Oct 1, 2011 at 14:41
  • I would have voted this up if it were an answer. I don't know PHP, so I couldn't say whether it has prepared statements like Java does. It will certainly help, because you only parse and validate the SQL once. Commented Oct 1, 2011 at 14:45
  • I've never used prepared statements... any chance of a decent article? Also, I'm using ExpressionEngine 1.6.8, so I am using its database class rather than going straight to the database myself. Commented Oct 1, 2011 at 16:56
  • For this type of usage, you might want to consider doing it as a straight procedural script; the MVC pattern wasn't really built for this type of job. If you truly want to speed the whole process up, you'll need to strip out what you don't need and use appropriate tools. One of those is using prepared statements. Commented Oct 1, 2011 at 19:02

3 Answers


Every record you insert is another round trip to the database.

I wonder if your life would be better if you could batch those SQL commands into a single round trip and execute them all at once? You'd cut down on the network latency that way.
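As a rough sketch of that idea (using PDO directly rather than the ExpressionEngine class, and with made-up table and column names), many rows can go into one multi-row INSERT, so the whole batch is a single round trip:

```php
<?php
// Sketch: batch many rows into one INSERT to cut round trips.
// $pdo is assumed to be an existing PDO connection; the table and
// column names are illustrative only.
$rows = [
    ['test',        'test group'],
    ['other paper', 'test group'],
];

$placeholders = [];
$values = [];
foreach ($rows as $row) {
    $placeholders[] = '(?, ?)';   // one tuple of placeholders per row
    foreach ($row as $v) {
        $values[] = $v;           // flatten values in the same order
    }
}

$sql = 'INSERT INTO papers (name, papergroup) VALUES '
     . implode(', ', $placeholders);
$stmt = $pdo->prepare($sql);
$stmt->execute($values);          // one round trip for all rows
```

Wrapping the whole run in a transaction (beginTransaction()/commit()) would also avoid committing after every single statement.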

The best way to figure out how to optimize anything is to have some hard data as to where the time is being spent. Find out what's taking the most time, change it, and re-measure. Repeat the exercise until you get acceptable performance.

I don't see any data from you. You're just guessing, and so is everyone else who answers here (including me).


5 Comments

I was thinking the same thing (on all four points). What's the old trope about optimization without insight?
That's called transactions and prepared statements.
@duffymo - I would love to do that, but I need to get the ID of the parent (if it exists; if not, create it and then get the ID) and pass it to the sub-statements. Can I do that all at once?
You'll have to do it as part of the transaction, not all at once. It's got to be one unit of work. Did you get those timings yet?
No, not yet (currently at home and it's on the work dev PC). From estimating earlier, it takes about 2 minutes just to get the XML from the source; however, I will be popping in tomorrow to sort out timing it all.

I'll write it as an algorithm:

1. Store the first array inside a new variable, $xmlArray.
2. SELECT the table to compare against from the database and store it in a variable, $tableArray.
3. foreach through $xmlArray and compare against $tableArray.
4. Save the needed updates into a new array, $diffArray.
5. Prepare a statement using PDO prepare() and bindParam().
6. foreach through $diffArray, changing only the parameters, and execute().

This should be the most efficient way to do what you need.
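The steps above might be sketched like this (connection details, table name, and array keys are assumptions, since the real structure comes from the OP's XML):

```php
<?php
// Sketch of the algorithm above: prepare once, execute many times.
$pdo = new PDO('mysql:host=localhost;dbname=test', 'user', 'pass', [
    PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION,
]);

// Step 2: pull existing names once, keyed by name for O(1) lookups.
$tableArray = $pdo->query('SELECT name, ID FROM papers')
                  ->fetchAll(PDO::FETCH_KEY_PAIR);

// Steps 3-4: diff the XML array against the table.
$diffArray = [];
foreach ($xmlArray as $item) {
    if (!isset($tableArray[$item['papername']])) {
        $diffArray[] = $item;   // not in the DB yet -> needs inserting
    }
}

// Step 5: prepare the statement once...
$stmt = $pdo->prepare('INSERT INTO papers (name) VALUES (:name)');
$stmt->bindParam(':name', $name);

// Step 6: ...then execute it many times, changing only the bound variable.
foreach ($diffArray as $item) {
    $name = $item['papername'];
    $stmt->execute();
}
```

The point of the single up-front SELECT is that it replaces 21,000 per-row SELECTs with one query plus cheap in-memory array lookups.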



I guess the simplest way is to build a batch array of, say, around 1,500 records and do a batch insert. I tried this with 2,000 records: single inserts in a while loop took 27 seconds, but a single batch insert took only 0.7 seconds.
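A rough sketch of that batching approach (the $pdo connection, the papers table, and the 'papername' key are assumptions carried over from the question):

```php
<?php
// Sketch: insert in chunks of ~1500 rows per statement instead of one
// row per statement. $xmlArray is the parsed XML array from the question.
$batchSize = 1500;

foreach (array_chunk($xmlArray, $batchSize) as $chunk) {
    // One "(?)" placeholder tuple per row in this chunk.
    $placeholders = implode(', ', array_fill(0, count($chunk), '(?)'));

    $values = [];
    foreach ($chunk as $item) {
        $values[] = $item['papername'];
    }

    $stmt = $pdo->prepare(
        "INSERT INTO papers (name) VALUES $placeholders"
    );
    $stmt->execute($values);   // one round trip per 1500 rows
}
```

Chunking matters because a single statement with all 21,000 rows could exceed MySQL's max_allowed_packet limit.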

