Weirdly, I have done a lot of development with MySQL and never encountered some of the things I have encountered today.
So, I have a user_items table
ID | name
---------
1 | test
I then have an item_data table
ID | item | added | info
-------------------------
1 | test | 12345 | important info
2 | test | 23456 | more recent important info
I then have an emails table
ID | added | email
-------------------
1 | 12345 | [email protected]
2 | 23456 | [email protected]
3 | 23456 | [email protected]
and an emails_verified table
ID | email
-----------
1 | [email protected]
Now I appreciate the setup of these tables may not be efficient etc, but this cannot be changed, and is a lot more complex than it may seem.
What I want to do is as follows: search through a user's items and display the associated info and any associated emails, along with whether each email has been verified.
user_items.name = item_data.item
item_data.added = emails.added
emails.email = emails_verified.email
So for user item 1, test, I want to be able to return its ID, its name, the most recent information, the most recent emails, and their verification status.
So I would like to return
ID => 1
name => test
information => more recent important info
emails => array('0' => array('email' => '[email protected]' , 'verified' => 'YES'),'1' => array('email' => '[email protected]' , 'verified' => 'NO'))
Now I could do this with multiple queries with relative ease. My research, however, suggests that this is significantly more costly in resources and time than using one (albeit very complex) MySQL query with lots of join statements.
Using one query would also be useful (I believe) because I could then add search functionality with relative ease by adding complex where clauses to the query.
To further complicate matters, I am using CodeIgniter. I cannot be too picky :) so any non-CI answers would still be very useful.
The code I have so far is as follows. It is, however, very much 'I'm not too sure what I'm doing'.
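function test_search()
{
    $this->load->database();
    $this->db->select('user_items.*, item_data.*');
    // FALSE stops CodeIgniter from escaping the raw SQL expressions
    $this->db->select('GROUP_CONCAT( emails.email SEPARATOR "," ) AS emails', FALSE);
    // IS NOT NULL is needed: a non-numeric string is cast to 0 (false) inside IF()
    $this->db->select('GROUP_CONCAT( IF(emails_verified.email IS NOT NULL,"YES","NO") SEPARATOR "," ) AS verified', FALSE);
    $this->db->where('user_items.name', 'test');
    // Join columns follow the relationships listed above.
    // Note: this still joins every item_data version, not just the most recent one.
    $this->db->join('item_data', 'user_items.name = item_data.item', 'LEFT');
    $this->db->join('emails', 'item_data.added = emails.added', 'LEFT');
    $this->db->join('emails_verified', 'emails.email = emails_verified.email', 'LEFT');
    $this->db->group_by('user_items.name');
    $res = $this->db->get('user_items');
    print_r($res->result_array());
}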
Any help with this would be very much appreciated.
This is really complex sql - is this really the best way to achieve this functionality?
Thanks
UPDATE
Following on from Cryode's excellent answer.
The only thing wrong with it is that it only returns one email. By using GROUP_CONCAT however I have been able to get all emails and all email_verified statuses into a string which I can then explode with PHP.
To clarify, is the subquery
SELECT item, MAX(added) AS added
FROM item_data
GROUP BY item
essentially creating a temporary table?
Similar to that outlined here
Surely the subquery is necessary to make sure you only get one row from item_data - the most recent one?
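For anyone following along, here is a minimal sketch of how the subquery (used as a derived table) and GROUP_CONCAT might fit together. It is not Cryode's exact query. It runs against an in-memory SQLite database through Python's sqlite3 module purely so that it is self-contained; SQLite standing in for MySQL is an assumption, as are the placeholder e-mail addresses and the helper name fetch_item. Each email is packed together with its YES/NO flag in one string, so the two values cannot drift out of step the way two separate GROUP_CONCAT columns could. In MySQL the `||` concatenation would be written with CONCAT().

```python
import sqlite3

# SQLite (in memory) stands in for MySQL; the addresses are hypothetical placeholders.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user_items (ID INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE item_data (ID INTEGER PRIMARY KEY, item TEXT, added INTEGER, info TEXT);
    CREATE TABLE emails (ID INTEGER PRIMARY KEY, added INTEGER, email TEXT);
    CREATE TABLE emails_verified (ID INTEGER PRIMARY KEY, email TEXT);

    INSERT INTO user_items VALUES (1, 'test');
    INSERT INTO item_data VALUES (1, 'test', 12345, 'important info');
    INSERT INTO item_data VALUES (2, 'test', 23456, 'more recent important info');
    INSERT INTO emails VALUES (1, 12345, 'old@example.com');
    INSERT INTO emails VALUES (2, 23456, 'a@example.com');
    INSERT INTO emails VALUES (3, 23456, 'b@example.com');
    INSERT INTO emails_verified VALUES (1, 'a@example.com');
""")

QUERY = """
SELECT ui.ID, ui.name, d.info,
       GROUP_CONCAT(e.email || '=' ||
                    CASE WHEN v.email IS NULL THEN 'NO' ELSE 'YES' END) AS emails
FROM user_items AS ui
JOIN (SELECT item, MAX(added) AS added      -- derived table: newest version per item
      FROM item_data
      GROUP BY item) AS latest ON latest.item = ui.name
JOIN item_data AS d ON d.item = latest.item AND d.added = latest.added
LEFT JOIN emails AS e ON e.added = d.added
LEFT JOIN emails_verified AS v ON v.email = e.email
WHERE ui.name = ?
GROUP BY ui.ID, ui.name, d.info
"""


def fetch_item(name):
    """Return the item with its newest info and e-mails, or None if not found."""
    row = conn.execute(QUERY, (name,)).fetchone()
    if row is None:
        return None
    item_id, item_name, info, packed = row
    emails = []
    # packed is NULL when the version has no e-mails; assumes addresses contain no commas
    for pair in (packed or "").split(","):
        if pair:
            email, verified = pair.rsplit("=", 1)
            emails.append({"email": email, "verified": verified})
    return {"ID": item_id, "name": item_name, "information": info, "emails": emails}


if __name__ == "__main__":
    print(fetch_item("test"))
```

The derived table only yields one (item, added) pair per item, which is what limits the join on item_data to the most recent row. The order of the emails inside the concatenated string is not guaranteed unless an ORDER BY is added inside GROUP_CONCAT.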
And finally to answer the notes about the poorly designed database.
The database was designed this way as item_data is changed regularly but we want to keep historical records.
The emails are part of the item data, but because there can be any number of emails, and we wanted them to be searchable, we opted for a separate table. Otherwise the emails would have to be serialized within the item_data table.
The emails_verified table is separate as an email can be associated with more than one item.
Given that, although (clearly) complicated for querying it still seems a suitable setup..?
Thanks
FINAL UPDATE
Cryode's answer is really useful with regard to database architecture in general.
Having conceptualised this a little more, if we store the version id in user_items we don't need the subquery.
Because none of the data between versions is necessarily consistent, we will scrap his proposed items table (for this case). We can then get the correct version from the item_data table. We can also get the items_version_emails rows based on the version id, and from these get the respective emails from our 'emails' table.
I.e. it works perfectly.
The downside of this is that when I add new version data in item_data I have to update the user_items table with the new version that has been inserted.
This is fine, but as a general point, which is quicker? I assume the reason such a setup has been suggested is that it is quicker: an extra update each time new data is added is worth it to save potentially hundreds of subqueries when lots of rows are being displayed, especially given that we display the data more often than we update it.
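To make the trade-off concrete, here is a minimal sketch of the version pointer setup, again using Python's sqlite3 with an in-memory database as a stand-in for MySQL. The column name version_id and the helpers add_version and current_info are hypothetical names chosen for illustration. The write path does the extra UPDATE inside the same transaction as the INSERT, and the read path becomes a plain join with no MAX(added) subquery. Whether this is actually quicker depends on indexes and data volume, so it is worth measuring (for example with EXPLAIN) rather than assuming.

```python
import sqlite3

# SQLite (in memory) stands in for MySQL; version_id is a hypothetical column name.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE item_data (ID INTEGER PRIMARY KEY, item TEXT, added INTEGER, info TEXT);
    CREATE TABLE user_items (ID INTEGER PRIMARY KEY, name TEXT,
                             version_id INTEGER REFERENCES item_data(ID));
    INSERT INTO item_data VALUES (1, 'test', 12345, 'important info');
    INSERT INTO user_items VALUES (1, 'test', 1);
""")


def add_version(item, added, info):
    """Insert a new item_data row and repoint user_items at it."""
    with conn:  # one transaction: both statements commit, or neither does
        cur = conn.execute(
            "INSERT INTO item_data (item, added, info) VALUES (?, ?, ?)",
            (item, added, info),
        )
        conn.execute(
            "UPDATE user_items SET version_id = ? WHERE name = ?",
            (cur.lastrowid, item),
        )
    return cur.lastrowid


def current_info(name):
    """Read the current version through the pointer; no subquery needed."""
    row = conn.execute(
        "SELECT d.info FROM user_items AS ui "
        "JOIN item_data AS d ON d.ID = ui.version_id "
        "WHERE ui.name = ?",
        (name,),
    ).fetchone()
    return row[0] if row else None


if __name__ == "__main__":
    add_version("test", 23456, "more recent important info")
    print(current_info("test"))
```

Older item_data rows stay in place as the historical record; only the pointer in user_items moves.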
Just for future reference when designing database architecture: does anyone have any links or general guidance on what is quicker and why, so that we can all build better optimized databases?
Thanks again to Cryode !!
"It is however very much 'I'm not too sure what I'm doing'." Define it. What is wrong with the above code? A bad design nonetheless.