SELECT id_num,
       SUM(expired) AS expired,
       MAX(`date`) AS max_date
FROM accounts
WHERE `date` <= 20170505
GROUP BY id_num;
The accounts table has a compound index on (id_num, date) and about 100 million rows. The query looks basic, but it takes forever, and I'm not sure how to break it up to speed things up. I thought about first creating a helper table of DISTINCT id_num values (~3 million rows), but then I don't see how to get the sum(expired) and max(date) columns without joining the helper back to the accounts table, which amounts to the same work as the original query.
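One way to break the query up without a helper table is to run it in slices of id_num. This is only a sketch: the bounds 1 and 100000 are placeholder values, and it assumes that `expired` in the query corresponds to the `sub_expired` column in the table definition. Because all rows for a given id_num fall into the same slice, the per-slice results can simply be concatenated to give the same output as the full query.

```sql
-- One slice of the original aggregate, restricted to a range of id_num values.
-- The bounds are placeholders; advance them on each run until all ids are covered.
SELECT id_num,
       SUM(sub_expired) AS expired,   -- assumes `expired` means `sub_expired`
       MAX(`date`)      AS max_date
FROM accounts
WHERE id_num BETWEEN 1 AND 100000
  AND `date` <= '2017-05-05'
GROUP BY id_num;
```

With the primary key on (id_num, date), each slice reads one contiguous part of the InnoDB clustered index. This probably does not reduce the total work, but it keeps each statement short and lets the job be paused, resumed, or spread over time.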
CREATE TABLE `accounts` (
`id_num` int(11) NOT NULL,
`date` date NOT NULL,
`time` datetime NOT NULL,
`price` decimal(10,4) NOT NULL,
`cost` decimal(10,4) NOT NULL,
`time_slices` int(11) NOT NULL,
`sub_expired` tinyint(1) NOT NULL,
PRIMARY KEY (`id_num`,`date`),
KEY `date` (`date`),
CONSTRAINT `accounts_ibfk_1` FOREIGN KEY (`id_num`) REFERENCES `cust` (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
The date range goes back roughly 18 months, and the data is spread fairly evenly over all dates. Rows are inserted chronologically, and updates/deletes are rare.
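Given that rows arrive chronologically and rarely change, one possible approach is to keep a running summary and fold in each new day, so the 100-million-row scan happens at most once. The following is a minimal sketch under stated assumptions: the table `accounts_summary` is hypothetical, the date '2017-05-06' is an example value, and `sub_expired` is assumed to be the column the query calls `expired`.

```sql
-- Hypothetical summary table: one row per id_num.
CREATE TABLE accounts_summary (
  id_num   INT  NOT NULL,
  expired  INT  NOT NULL,
  max_date DATE NOT NULL,
  PRIMARY KEY (id_num)
) ENGINE=InnoDB;

-- One-time initial fill: the original (slow) aggregate up to a cutoff date.
INSERT INTO accounts_summary (id_num, expired, max_date)
SELECT id_num, SUM(sub_expired), MAX(`date`)
FROM accounts
WHERE `date` <= '2017-05-05'
GROUP BY id_num;

-- Incremental step: fold in a single new day (example date).
-- New ids are inserted; existing ids have their totals updated.
INSERT INTO accounts_summary (id_num, expired, max_date)
SELECT id_num, SUM(sub_expired), MAX(`date`)
FROM accounts
WHERE `date` = '2017-05-06'
GROUP BY id_num
ON DUPLICATE KEY UPDATE
  expired  = expired + VALUES(expired),
  max_date = GREATEST(max_date, VALUES(max_date));
```

The incremental step filters on a single `date` value, so it can use the existing `date` index and touches only that day's rows. Each day must be folded in exactly once, and the summary goes stale if old rows are later updated or deleted, so those rare cases would need a separate correction or a rebuild.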
So there can be multiple rows per id_num, but never two on the same day? (I ask because the PK is rather unusual.)
Is time the full date+time? And date is just the date part of time?
Yes, multiple rows per id_num, but only one per day. Also, time is the full date+time and date is just the date part of time, as you mentioned.