I have a database with several tables, but only four of them are involved in the query I want to optimize:
albums, songs, genres, genre_song.
A song can have many genres, and a genre many songs. An album can have many songs. An album is related to genres through songs.
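For anyone who wants to follow along, the relationships above can be sketched roughly as the MySQL schema below. It only includes the columns that appear in the query in this post. The column types, the foreign keys, and the composite primary key on genre_song are assumptions made for illustration, not the actual table definitions.

```sql
-- Hypothetical minimal schema inferred from the columns used in the query.
-- Types and keys are assumptions; the real tables have many more fields.
CREATE TABLE albums (
    id           BIGINT UNSIGNED NOT NULL PRIMARY KEY,
    published    TINYINT(1)      NOT NULL DEFAULT 0,
    release_date DATE            NOT NULL
);

-- An album has many songs.
CREATE TABLE songs (
    id       BIGINT UNSIGNED NOT NULL PRIMARY KEY,
    album_id BIGINT UNSIGNED NOT NULL,
    FOREIGN KEY (album_id) REFERENCES albums (id)
);

CREATE TABLE genres (
    id BIGINT UNSIGNED NOT NULL PRIMARY KEY
);

-- Pivot table for the many-to-many relation between songs and genres.
CREATE TABLE genre_song (
    genre_id BIGINT UNSIGNED NOT NULL,
    song_id  BIGINT UNSIGNED NOT NULL,
    PRIMARY KEY (genre_id, song_id),  -- assumed; the real key may differ
    FOREIGN KEY (genre_id) REFERENCES genres (id),
    FOREIGN KEY (song_id)  REFERENCES songs (id)
);
```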
The goal is to recommend albums that share at least one genre with a given album.
That led me to this query:
SELECT *
FROM `albums`
WHERE EXISTS
    (SELECT *
     FROM `songs`
     WHERE `albums`.`id` = `songs`.`album_id`
       AND EXISTS
         (SELECT *
          FROM `genres`
          INNER JOIN `genre_song` ON `genres`.`id` = `genre_song`.`genre_id`
          WHERE `songs`.`id` = `genre_song`.`song_id`
            AND `genres`.`id` IN (6)))
  AND `id` <> 37635
  AND `published` = 1
ORDER BY `release_date` DESC
LIMIT 6
This query takes between 1.4s and 1.6s. I would like to reduce that as much as possible. The ideal goal would be less than 10ms 😁
I am already using indexes on several tables, and I have managed to reduce other queries from up to 4 seconds to only 15-20ms. I am willing to use anything to reduce the execution time to a minimum.
I am using Laravel, so this is the same query written with Eloquent:
$relatedAlbums = Album::whereHas('songs.genres', function ($query) use ($album) {
        $query->whereIn('genres.id', $album->genres->pluck('id'));
    })
    ->where('id', '<>', $album->id)
    ->orderByDesc('release_date')
    ->take(6)
    ->get();
Note: the album's genres were already loaded before this query runs.
If you want to recreate the tables and some fake data in your database, here is the structure
[...] release_date field on any table. 2. You are executing a query with $album->genres->pluck('id'). 3. You should try running EXPLAIN on each individual query to make sure they are using an index.
[...] $album->genres does not make another query. 3. I have been doing that since the beginning. It is only that the indexes don't work with EXISTS. That's why I'm here, to seek help.
[...] EXISTS is the bottleneck here. MySQL EXISTS is pretty performant. I would follow @Pablo's advice, and maybe share the result for us to have a look? And how large of a dataset are we talking about? Also, you mentioned that there are many fields. Depending on the type of fields, you might get a little edge by selecting only the required fields in the subqueries.
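As a sketch of the EXPLAIN suggestion from the comments: prefixing the query with EXPLAIN makes MySQL print its execution plan instead of running the query. The statement below is the query from the question with the same literal values, unchanged apart from the EXPLAIN keyword. No output is shown here because it depends on the actual tables, indexes, and data.

```sql
-- Show the execution plan for the slow query instead of running it.
-- In the result, the `key` column names the index chosen for each table
-- (NULL means no index is used) and `rows` is the estimated row count.
EXPLAIN
SELECT *
FROM `albums`
WHERE EXISTS
    (SELECT *
     FROM `songs`
     WHERE `albums`.`id` = `songs`.`album_id`
       AND EXISTS
         (SELECT *
          FROM `genres`
          INNER JOIN `genre_song` ON `genres`.`id` = `genre_song`.`genre_id`
          WHERE `songs`.`id` = `genre_song`.`song_id`
            AND `genres`.`id` IN (6)))
  AND `id` <> 37635
  AND `published` = 1
ORDER BY `release_date` DESC
LIMIT 6;
```

A row with `type` = ALL indicates a full table scan on that table, which is usually the first thing to look for when a query like this is slow.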