4

I need to check whether some integer value is already in my database (which is growing all the time), and this has to be done several thousand times in one script. I'm considering two alternatives:

  1. Read all those numbers from the MySQL database into a PHP array and, every time I need to check one, use the in_array function.
  2. Every time I need to check a number, just execute something like SELECT number FROM table WHERE number='#' LIMIT 1

On the one hand, searching in an array that is stored in RAM should be faster than querying MySQL every time (as I have mentioned, these checks are performed about a thousand times during one script execution). On the other hand, the DB is growing, and that array may become quite big, which may slow things down.

The question is: which way is faster, or better in some other respect?

7
  • 6
    Databases are built to be queried and searched, and sometimes they are also cached in RAM. PHP's array functions are good at searching, but they are no match for a compiled database engine. Commented Aug 16, 2010 at 12:35
  • You do have a memory limit so putting everything in an array may not offer a solution at all. Caching is the only way to go if you're talking large databases, but you do need to establish acceptable freshness duration of data first. Commented Aug 16, 2010 at 12:36
  • How big is this database table? 10 rows? 1,000 rows? 1,000,000 rows? 1,000,000,000 rows? (The answer to that will make a huge difference in the optimal strategy)... Also, can you "batch" these numbers up (so instead of doing 3000 queries, only do 30 with each looking up 100 numbers)? Commented Aug 16, 2010 at 12:38
  • 2
    At that size, it's not clear cut on either side (a 5000 row int array would take up a fair amount of ram, but not a ridiculous amount), and the overhead of the array operations may be offset by the TCP overhead. So I think you're right in the sweet spot where both will be about the same. A little more data (say 50k+ rows) and the db will be faster. A little less (say 1k rows) and php might be faster. So I don't think speed will be the primary concern... Commented Aug 16, 2010 at 12:50
  • 1
    Hmm, it may also depend on the script then. In general, it depends on the memory limit on your server and where the script will be used. Can you describe some more details about it? 5k is quite a small number for a database, even if it's MySQL. On the other hand, if your DB server resides on a different machine, this may be a problem. I'd go with the DB solution but try to optimize the number of queries. Commented Aug 16, 2010 at 12:53

4 Answers 4

1

I have to agree that #2 is your best choice. When performing a query with LIMIT 1, MySQL stops the query as soon as it finds the first match. Make sure the columns you intend to search by are indexed.
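As a rough illustration of option #2, here is a minimal sketch of an indexed existence check using a PDO prepared statement. The table name `numbers`, the index name, and the helper `numberExists` are hypothetical, and the connection uses an in-memory SQLite database purely so the snippet runs on its own; with MySQL only the DSN and credentials would differ.

```php
<?php
// Stand-in connection so the sketch is self-contained (assumption: pdo_sqlite
// is available). For MySQL, swap in a 'mysql:' DSN plus credentials.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Hypothetical table. The index on `number` is what keeps each lookup cheap.
$pdo->exec('CREATE TABLE numbers (number INTEGER NOT NULL)');
$pdo->exec('CREATE INDEX idx_numbers_number ON numbers (number)');
$pdo->exec('INSERT INTO numbers (number) VALUES (3), (17), (42)');

// Prepare once, then execute many times with different values.
$stmt = $pdo->prepare('SELECT 1 FROM numbers WHERE number = ? LIMIT 1');

// Hypothetical helper: true if $n is already stored.
function numberExists(PDOStatement $stmt, int $n): bool
{
    $stmt->execute([$n]);
    $found = $stmt->fetchColumn() !== false; // false means no row matched
    $stmt->closeCursor();                    // free the result before reuse
    return $found;
}

var_dump(numberExists($stmt, 42));
var_dump(numberExists($stmt, 5));
```

Preparing the statement once avoids re-parsing the SQL on each of the thousands of checks, and binding the value as a parameter avoids quoting an integer as a string.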


1

It sounds like you are duplicating a Unique Constraint in code...

CREATE TABLE MyTable(
    SomeUniqueValue INT NOT NULL,
    CONSTRAINT MyUniqueKey UNIQUE (SomeUniqueValue)
);

1 Comment

No, I don't. I need to check it in my algorithm.
0

How does the number of times you need to check compare with the number of values stored in the database? If it's 1:100 then you're probably better off searching in the database each time; if it's (some amount) less, then preloading the list will be faster. What happened when you tested it?

However, even if the ratio is low enough for loading the full table to be faster, this will gobble up memory and could, as a result, make everything else run more slowly.

So I would recommend not loading it all into memory. But if you can, then batch the checks up to minimise the number of round trips to the database.
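One way the batching could look, as a sketch only: look up many candidates per query with an `IN (...)` list and keep the matches as array keys, so later checks use `isset` instead of a linear `in_array` scan. The table name `numbers`, the helper `findExisting`, and the chunk size of 100 are assumptions for illustration, and in-memory SQLite stands in for MySQL so the snippet runs by itself.

```php
<?php
// Stand-in connection (assumption: pdo_sqlite is available); use a 'mysql:'
// DSN for the real database.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE numbers (number INTEGER NOT NULL UNIQUE)');
$pdo->exec('INSERT INTO numbers (number) VALUES (3), (17), (42)');

// Hypothetical helper: returns the candidates that are already stored,
// as a set keyed by number, using one query per chunk of candidates.
function findExisting(PDO $pdo, array $candidates, int $chunkSize = 100): array
{
    $existing = [];
    foreach (array_chunk($candidates, $chunkSize) as $chunk) {
        // Build "?,?,?" with one placeholder per value in this chunk.
        $placeholders = implode(',', array_fill(0, count($chunk), '?'));
        $stmt = $pdo->prepare(
            "SELECT number FROM numbers WHERE number IN ($placeholders)"
        );
        $stmt->execute($chunk);
        foreach ($stmt->fetchAll(PDO::FETCH_COLUMN) as $n) {
            $existing[(int) $n] = true; // key lookup instead of a value scan
        }
    }
    return $existing;
}

$existing = findExisting($pdo, [3, 5, 42, 99]);
var_dump(isset($existing[42]));
var_dump(isset($existing[5]));
```

This only fits when the numbers to check are known up front; if each check depends on the outcome of the previous one, the per-number query is the simpler choice.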

C.


0

Querying the database is the best option. First, you said the database is growing, which means new values are being added to the table, whereas with in_array you would be checking against old values. Secondly, you might exhaust the RAM allotted to PHP with a very large amount of data. Thirdly, MySQL has its own query optimizer and other optimizations, which make it a far better choice than PHP.
