This question is going out to all the MySQL gurus out there. I've been building an ad network and am trying to optimize the main SQL query that fetches the ads. It checks for ads that meet the following criteria:
- Geography (geo)
- Operating System (os)
- Device Type (device)
- Ad Status (enabled)
- Whether the ad's conversion type is one of the given conversion types (conversion)
- Whether the number of ad clicks per IP is less than the max frequency allowed (frequency)
- The subid is not currently in the blacklist for that ad (ws.nid IS NULL)
- Whether the ad has Adometry filtering enabled: no ad is shown if the IP/UA is on the Adometry list and Adometry is turned on for that ad
- Compares the publisher's current conversion click-through rate with the ad's max click-through rate, and doesn't show the ad if both the ad's current click-through rate and the publisher's click-through rate are over that max
- Compares the ad's current spend with where it should be at this point in the day (based on its max daily budget), and doesn't show the ad if the spend is higher than that
- Does a keyword search depending on the match type set by the ad.
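To pin down what I mean by the keyword match types, here is a rough Python sketch of the behaviour I expect from the WHERE clause in the query. The function and constant names are made up for illustration and are not part of my code. It assumes a case-insensitive collation, and it ignores the fact that `_` or `%` inside a stored keyword would act as a wildcard in a real `LIKE`.

```python
import re
from typing import Optional

# Hypothetical names for the match_type values used in the query.
RON, EXACT, PHRASE, BROAD = 0, 1, 2, 3


def keyword_matches(match_type: int, ad_keyword: Optional[str], search: str) -> bool:
    """Approximate the keyword part of the WHERE clause for one ad/keyword row."""
    if match_type == RON:
        # Run-of-network ads match regardless of keyword.
        return True
    if ad_keyword is None:
        # No keyword row from the LEFT JOIN: the comparison is NULL, so no match.
        return False
    kw, s = ad_keyword.lower(), search.lower()  # assumes case-insensitive collation
    if match_type == EXACT:
        return kw == s
    if match_type == PHRASE:
        # :keyword LIKE CONCAT('%', wk.keyword, '%')
        return kw in s
    if match_type == BROAD:
        # Spaces become '%', so the words must appear in order with anything between.
        pattern = ".*".join(re.escape(part) for part in kw.split(" "))
        return re.search(pattern, s, re.DOTALL) is not None
    return False


print(keyword_matches(PHRASE, "red shoes", "cheap red shoes"))
print(keyword_matches(BROAD, "red shoes", "red running shoes"))
```

Note that the broad match as written is order-sensitive: a stored keyword "red shoes" would not match the search "shoes in red".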
This is the current SQL query I am running:
SELECT
w.nid as w_nid,
w.uid as w_uid,
w.status as w_status,
w.landing_page as w_landing_page,
w.starting_bid as w_starting_bid,
w.daily_budget as w_daily_budget,
w.revshare as w_revshare,
w.filters as w_filters,
w.device as w_device,
w.os as w_os,
w.conversion as w_conversion,
w.max_ctr as w_max_ctr,
w.frequency as w_frequency,
w.ad_title as w_ad_title,
w.ad_desc as w_ad_desc,
w.ad_728x90 as w_ad_728x90,
w.ad_300x250 as w_ad_300x250,
w.ad_160x600 as w_ad_160x600,
w.match_type as w_match_type,
wg.nid as wg_nid,
wg.geo as wg_geo,
wk.keyword as wk_keyword,
wk.nid as wk_nid,
IFNULL(wcs.estimate,0) as wcs_spend,
IFNULL(((wcs.conversions/wcs.conversion_impressions)*100),0) as wcs_ctr,
pss.bid as pss_bid,
pss.ctr as pss_ctr,
wci.count as wci_count,
ws.*
FROM
websites w
LEFT JOIN
websites_geos wg
ON
wg.nid = w.nid
LEFT JOIN
websites_keywords wk
ON
wk.nid = w.nid
LEFT JOIN
api_bucket_website_daily_clicks wcs
ON
wcs.nid = w.nid AND wcs.date = CURDATE()
LEFT JOIN
publisher_subid_stats pss
ON
pss.uniq = CONCAT(w.nid,'_',:pid,'_',:subid)
LEFT JOIN
websites_cur_ips wci
ON
wci.unique = CONCAT(CURDATE(),:ip,w.nid)
LEFT JOIN
websites_subids ws
ON
w.nid = ws.nid AND CONCAT(:pid,'_',:subid) = ws.subid
WHERE
(
(
match_type = 0 /* MATCH RON KEYWORDS */
)
OR
(
wk.keyword = :keyword /* MATCH EXACT KEYWORD */
AND
match_type = 1
)
OR
(
:keyword LIKE CONCAT('%',wk.keyword,'%') /* MATCH PHRASE KEYWORD */
AND
match_type = 2
)
OR
(
:keyword LIKE CONCAT('%',REPLACE(wk.keyword, ' ', '%'),'%') /* MATCH BROAD KEYWORD */
AND
match_type = 3
)
)
AND
wg.geo = :geo
AND
w.os = :os
AND
w.device = :device
AND
w.enabled = 1
AND
w.conversion IN (:conversiontype)
AND
((:sectoday/86400) * w.daily_budget) >= IFNULL(wcs.estimate,0)
AND
IFNULL(wci.count,0) < w.frequency
AND
ws.nid IS NULL
AND
((:adometry = 0) OR (:adometry = 1 AND w.filters = 0))
AND
(
(
IFNULL(((wcs.conversions/wcs.conversion_impressions)*100),0) <= w.max_ctr
AND
IFNULL(((wcs.conversions/wcs.conversion_impressions)*100),0) >= IFNULL(pss.ctr,0)
)
OR
(
IFNULL(((wcs.conversions/wcs.conversion_impressions)*100),0) >= IFNULL(pss.ctr,0)
)
)
ORDER BY
IFNULL(pss.bid,w.starting_bid) DESC, RAND()
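One thing I noticed while writing this up: the final CTR block has the shape (A AND B) OR (B), where A is "ad CTR <= w.max_ctr" and B is "ad CTR >= IFNULL(pss.ctr,0)". As far as I can tell that reduces to just B, so the max_ctr comparison does not currently change which rows are returned, which is not what I described in the criteria above. Here is a small Python check of that claim using SQL-style three-valued logic (None stands in for NULL; the helper names are only for illustration):

```python
from itertools import product


def sql_and(a, b):
    """SQL AND with NULL represented as None."""
    if a is False or b is False:
        return False
    if a is None or b is None:
        return None
    return True


def sql_or(a, b):
    """SQL OR with NULL represented as None."""
    if a is True or b is True:
        return True
    if a is None or b is None:
        return None
    return False


def ctr_clause(a, b):
    # Same shape as the query: (A AND B) OR (B)
    return sql_or(sql_and(a, b), b)


# Evaluate every combination of TRUE / FALSE / NULL for A and B.
results = {(a, b): ctr_clause(a, b) for a, b in product((True, False, None), repeat=2)}
all_reduce_to_b = all(value is b for (_, b), value in results.items())
print(all_reduce_to_b)  # True: the clause always equals B
```

So whatever the fix turns out to be, that block probably needs its logic reworked as well as optimized.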
I look at this query and I cry, because even though it responds very quickly right now, we expect to receive north of a billion queries a day against it, with well over 500 advertisers, and I want to make sure it is as optimized as possible. Let me know if you need more info!