I was wondering if I could optimize this further; maybe someone has struggled with the same thing.
First of all, I have this table:
CREATE TABLE `site_url` (
`id` BIGINT(20) UNSIGNED NOT NULL AUTO_INCREMENT,
`url_hash` CHAR(32) NULL DEFAULT NULL,
`url` VARCHAR(2048) NULL DEFAULT NULL,
PRIMARY KEY (`id`),
INDEX `url_hash` (`url_hash`)
)
ENGINE=InnoDB;
where I store the site URI (the domain is in a different table, but for the purpose of this question it doesn't matter, I hope).
url_hash is the MD5 calculated from url.
All fields seem to have sensible lengths and the indexes should be correct, but there is a lot of data in the table and I'm looking for further optimization.
Standard query looks like this:
select id from site_url where site_url.url_hash = MD5('something - often calculated in application rather than in mysql') and site_url.url = 'something - often calculated in application rather than in mysql'
describe gives:
+----+-------------+----------+------+---------------+----------+---------+-------+------+------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------+------+---------------+----------+---------+-------+------+------------------------------------+
| 1 | SIMPLE | site_url | ref | url_hash | url_hash | 97 | const | 1 | Using index condition; Using where |
+----+-------------+----------+------+---------------+----------+---------+-------+------+------------------------------------+
But I'm wondering if I could help MySQL with that search. It must be the InnoDB engine, and I can't add a key on url because of its length.
A friend of mine told me to shorten the hash to 16 chars and store it as a number. Will an index on BIGINT be faster than one on CHAR(32)? My friend also suggested computing the MD5 and taking the first/last 16 chars of it, but I think that will cause a lot more collisions.
What are your thoughts about it?
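To put rough numbers on the truncation idea, here is a minimal sketch in Python, assuming the hash is computed in the application. The helper names `truncated_hash` and `collision_probability` are made up for illustration, and the estimate uses the standard birthday approximation, which assumes MD5 output behaves like uniformly random bits.

```python
import hashlib
import math

# Largest value a MySQL BIGINT UNSIGNED column can hold (64 bits).
BIGINT_UNSIGNED_MAX = 2**64 - 1


def truncated_hash(url: str) -> int:
    """Hypothetical helper: first 16 hex chars of the MD5, as an integer."""
    hex_digest = hashlib.md5(url.encode("utf-8")).hexdigest()
    return int(hex_digest[:16], 16)  # 16 hex chars = 64 bits


def collision_probability(n_rows: int, bits: int) -> float:
    """Birthday approximation: P ~= 1 - exp(-n(n-1) / 2^(bits+1))."""
    exponent = n_rows * (n_rows - 1) / 2 ** (bits + 1)
    return -math.expm1(-exponent)  # equals 1 - exp(-exponent), stable for tiny values


if __name__ == "__main__":
    # A 64-bit truncation always fits in BIGINT UNSIGNED.
    print(truncated_hash("/uri") <= BIGINT_UNSIGNED_MAX)

    # Chance of at least one collision anywhere in the table.
    for n in (10**6, 10**8, 10**9):
        print(n, collision_probability(n, 64), collision_probability(n, 128))
```

Under these assumptions, a 64-bit truncation only reaches a chance of a few percent of any collision at around a billion rows, and since the query also compares url, a collision costs an extra row comparison rather than a wrong result.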
Change url_hash to binary(16). An integer won't be large enough to store the full hash as a number. Storing the raw 16 bytes instead of 32 hex characters should save you some space. Also, tuning MySQL itself will help immensely. Look up your innodb_buffer_pool_size variable and search around to see how people set it to improve MySQL performance.
insert into site_url (url_hash,url) values (UNHEX(md5('/uri')),'/uri');
and SELECT:
SELECT id FROM site_url USE INDEX (url_hash) WHERE url_hash = UNHEX(md5('/uri')) AND url = '/uri';
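If the hash is computed in the application rather than in MySQL, the equivalent of UNHEX(md5(...)) is the raw MD5 digest. The following is a minimal sketch in Python; `url_hash_bytes` is a hypothetical helper name, and it assumes the URL is encoded as UTF-8 in the same way the database connection would encode it.

```python
import hashlib


def url_hash_bytes(url: str) -> bytes:
    """Hypothetical helper: raw 16-byte MD5 digest, suitable for a BINARY(16) column."""
    return hashlib.md5(url.encode("utf-8")).digest()


if __name__ == "__main__":
    url = "/uri"
    digest = url_hash_bytes(url)
    hex_form = hashlib.md5(url.encode("utf-8")).hexdigest()

    # The hex form is 32 characters, the raw digest is 16 bytes.
    print(len(hex_form), len(digest))

    # Decoding the hex string gives the same bytes, which is what UNHEX() does.
    print(bytes.fromhex(hex_form) == digest)

    # A full MD5 is 128 bits wide, so it cannot fit in a 64-bit BIGINT.
    print(2**128 - 1 > 2**64 - 1)
```

The resulting bytes would be passed as a bound parameter for url_hash in the INSERT and SELECT statements, instead of calling UNHEX(md5(...)) on the server.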