3

I currently have 2000 records in a PostgreSQL database that are updated every minute and filtered with a SQL statement. Up to 1000 different filter combinations can exist, and approximately 500 different filters can be called every minute. At the moment, HTTP responses are cached for 59 seconds to ease server load and reduce database calls. However, I'm considering caching the whole table in memcached and doing the filtering in PHP. 2000 rows isn't a lot, but getting the data from memory rather than from the database would be a lot faster.

Would the PHP processing time outweigh the database response time for SQL filtering with this number of rows? The table shouldn't grow beyond 3000 rows in the foreseeable future.

2
  • I wouldn't say I have a desire to perform the task in PHP. If I got no performance benefit, I would have no desire at all. Commented Feb 21, 2012 at 1:57
  • @shapeshifter: You'll want to add 'memcached' tag to this question. Also, as Michael said, results are dependent on your own environment. Commented Feb 21, 2012 at 2:06

2 Answers

5

As with any "is x faster than y" question, the only real answer is to benchmark it yourself. However, if the database is properly indexed for the queries you need to perform, it is likely to be quite a bit faster at filtering result sets than almost any PHP code you could write.

The RDBMS, on the other hand, is already designed and optimized for locating, filtering, and ordering rows.
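
One way to benchmark the database side is to let PostgreSQL time a representative filter query itself. The sketch below assumes a hypothetical table named records with columns category, price and updated_at. None of these names come from the question, so substitute your own table and one of your real filters.

```sql
-- Hypothetical table and columns; replace with a real filter query.
-- EXPLAIN ANALYZE actually executes the query and reports the chosen
-- plan together with the measured execution time.
EXPLAIN ANALYZE
SELECT *
FROM   records
WHERE  category = 'books'
AND    price < 20
ORDER  BY updated_at DESC;
```

Run it a few times and compare the reported time with how long the equivalent PHP loop over the memcached copy takes on the same hardware. Keep in mind that the reported time does not include the network round trip between PHP and the database.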


1 Comment

Also, there's no reason the database couldn't just be given enough memory to keep the entire table in memory.
1

The way PostgreSQL operates, unless you are severely starving it of memory, 100% of such a small and frequently queried table will already be held in RAM (cache) by the default caching algorithms. Having the database engine filter it is almost certainly faster than doing the same in your application.
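
If you want to check the caching claim on your own server, the statistics view pg_statio_user_tables gives a rough picture. This is only a sketch, and records is a hypothetical table name standing in for yours.

```sql
-- 'records' is a hypothetical table name.
-- heap_blks_hit  = table blocks found in PostgreSQL's shared buffers
-- heap_blks_read = table blocks that had to be requested from outside
--                  shared buffers (served by the OS cache or by disk)
SELECT relname, heap_blks_read, heap_blks_hit
FROM   pg_statio_user_tables
WHERE  relname = 'records';
```

If heap_blks_hit keeps growing while heap_blks_read stays nearly flat, the table is being served from PostgreSQL's own buffer cache.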

You may want to inspect your postgresql.conf, especially shared_buffers, the planner cost constants (set random_page_cost almost or exactly as low as seq_page_cost) and effective_cache_size (set it high enough).
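
The settings named above can be inspected from any SQL session, and the planner settings can be tried out per session before postgresql.conf is touched. The values below are illustrative assumptions, not recommendations for your hardware.

```sql
-- Inspect the current values.
SHOW shared_buffers;
SHOW effective_cache_size;
SHOW seq_page_cost;       -- 1.0 by default
SHOW random_page_cost;    -- 4.0 by default

-- Try planner settings for the current session only (illustrative values).
SET random_page_cost = 1.1;
SET effective_cache_size = '2GB';
```

shared_buffers cannot be changed with SET; it has to be set in postgresql.conf and needs a server restart. Once the session-level experiments look good, the planner settings can be made permanent in postgresql.conf as well.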

You could probably benefit from optimizing indexes. There is a wide range of types available. Consider partial indexes, indexes on expressions, or multi-column indexes in addition to plain indexes. Test with EXPLAIN ANALYZE and only keep indexes that actually get used and speed up queries. As all of the table resides in RAM, the query planner should calculate that random access is almost or exactly as fast as sequential access. The difference only applies to disk reads.
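
As a sketch of the index types mentioned above, assuming a hypothetical records table with columns category, price, status and name (all invented for illustration):

```sql
-- Multi-column index for filters on category and price together.
CREATE INDEX records_category_price_idx
    ON records (category, price);

-- Partial index: only covers the rows a common filter cares about.
CREATE INDEX records_active_price_idx
    ON records (price)
    WHERE status = 'active';

-- Index on an expression, for case-insensitive lookups.
CREATE INDEX records_lower_name_idx
    ON records (lower(name));

-- After running for a while, list how often each index was scanned.
-- Indexes with idx_scan = 0 are candidates for removal.
SELECT indexrelname, idx_scan
FROM   pg_stat_user_indexes
WHERE  relname = 'records'
ORDER  BY idx_scan;
```

With only a few thousand rows, the planner may well choose a sequential scan over any of these, which is exactly what EXPLAIN ANALYZE and the idx_scan counters will reveal.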

As you are updating every minute, be sure not to keep any indexes that aren't actually helping. Also, vacuuming and analyzing the table frequently are key to performance in such a case. Not VACUUM FULL ANALYZE, just VACUUM ANALYZE. Or use autovacuum with tuned settings.
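
The maintenance advice could look roughly like this. The table name records is hypothetical and the scale factors are illustrative assumptions that would need tuning against your actual update volume.

```sql
-- Manual maintenance: reclaim dead rows and refresh planner statistics.
VACUUM ANALYZE records;

-- Or make autovacuum more aggressive for this one table
-- (illustrative values; lower factors trigger runs after fewer changes).
ALTER TABLE records SET (
    autovacuum_vacuum_scale_factor  = 0.05,
    autovacuum_analyze_scale_factor = 0.02
);

-- Check dead-row counts and when maintenance last ran.
SELECT n_dead_tup, last_vacuum, last_autovacuum,
       last_analyze, last_autoanalyze
FROM   pg_stat_user_tables
WHERE  relname = 'records';
```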

Of course, all the standard advice on performance optimization applies.
