1

Is it possible using just SQL and MySQL to get the "OUTPUT" below?

SAMPLE DATA: To elaborate with an example, let's assume I am trying to load a tab-separated file containing each employee's name, the offices they have occupied in the past, and their job title history.

File:

EmployeeName<tab>OfficeHistory<tab>JobLevelHistory
John Smith<tab>501<tab>Engineer
John Smith<tab>601<tab>Senior Engineer
John Smith<tab>701<tab>Manager
Alex Button<tab>601<tab>Senior Assistant
Alex Button<tab>454<tab>Manager

NOTE: The single table database is completely normalized (as much as a single table may be) -- and for example, in the case of "John Smith" there is only one John Smith; meaning there are no duplicates that would lead to conflicts in referential integrity.

The MyOffice database schema has the following tables:

Employee (nId, name)
Office (nId, number)
JobTitle (nId, titleName)
Employee2Office (nEmpID, nOfficeId)
Employee2JobTitle (nEmpId, nJobTitleID)

OUTPUT: So in this case, the tables should look like:

Employee
1 John Smith
2 Alex Button

Office
1 501
2 601
3 701
4 454

JobTitle
1 Engineer
2 Senior Engineer
3 Manager
4 Senior Assistant

Employee2Office
1 1
1 2
1 3
2 2
2 4

Employee2JobTitle
1 1
1 2
1 3
2 4
2 3

Here's the MySQL DDL to create the database and tables:

create database MyOffice2;

use MyOffice2;

CREATE TABLE Employee (
  id MEDIUMINT NOT NULL AUTO_INCREMENT,
  name CHAR(50) NOT NULL,
  PRIMARY KEY (id)
) ENGINE=InnoDB;

CREATE TABLE Office (
  id MEDIUMINT NOT NULL AUTO_INCREMENT,
  office_number INT NOT NULL,
  PRIMARY KEY (id)
) ENGINE=InnoDB;

CREATE TABLE JobTitle (
  id MEDIUMINT NOT NULL AUTO_INCREMENT,
  title CHAR(30) NOT NULL,
  PRIMARY KEY (id)
) ENGINE=InnoDB;

CREATE TABLE Employee2JobTitle (
  employee_id MEDIUMINT NOT NULL,
  job_title_id MEDIUMINT NOT NULL,
  FOREIGN KEY (employee_id) REFERENCES Employee(id),
  FOREIGN KEY (job_title_id) REFERENCES JobTitle(id),
  PRIMARY KEY (employee_id, job_title_id)
) ENGINE=InnoDB;

CREATE TABLE Employee2Office (
  employee_id MEDIUMINT NOT NULL,
  office_id MEDIUMINT NOT NULL,
  FOREIGN KEY (employee_id) REFERENCES Employee(id),
  FOREIGN KEY (office_id) REFERENCES Office(id),
  PRIMARY KEY (employee_id, office_id)
) ENGINE=InnoDB;
  • I'm sure you probably could, using a temp table somehow, but I would think it much easier to write a shell script for a widely installed interpreter (sh, bash, php, python, perl, etc.). Commented Jan 29, 2011 at 21:23
  • I don't really understand what you are trying to do. Do you want a single insert query that will insert all of the loaded data into all of the tables at once? Could you please explain a bit further what you want to do? Commented Jan 29, 2011 at 21:24
  • Sorry, but I'm having difficulty following what exactly you are trying to achieve here. Do you wish to import the data into the already created db, create the db on the fly, or something else? Commented Jan 29, 2011 at 21:27
  • @blunders: Well, it would be one thing if you wanted to load the data file into a table that has the same columns as the data. You can do that out of the box (see hade's answer), but there isn't a mechanism for creating multiple related records from a flat data file. You have to script that. You could probably script it in SQL directly, but like I said, unless you don't have the other languages available, I can't see much advantage to using SQL over something else. Commented Jan 29, 2011 at 21:49
  • What you wrote under "OUTPUT" is exactly how your database should be laid out. It's not normalised at all at the moment. That's why you're surprised about MySQL not being able to do it for you: it's not supposed to. Commented Jan 29, 2011 at 22:37

3 Answers

2

You can use a pass-through table and a trigger for this. Periodically, or from your calling app, delete the rows from this table once you are done with them.

create table TmpEmp (
EmployeeName char(50) not null,
OfficeHistory int null,
JobLevelHistory char(30) null);

Create a trigger on this table

delimiter |
CREATE TRIGGER tg_TmpEmp BEFORE INSERT ON TmpEmp
FOR EACH ROW
BEGIN
IF not exists (select * from Employee where Name = NEW.EmployeeName) THEN
    INSERT INTO Employee(name)
        select NEW.EmployeeName;
END IF;
IF not exists (select * from Office where office_number = NEW.OfficeHistory) THEN
    INSERT INTO Office(office_number)
        select NEW.OfficeHistory;
END IF;
IF not exists (select * from JobTitle where title = NEW.JobLevelHistory) THEN
    INSERT INTO JobTitle(title)
        select NEW.JobLevelHistory;
END IF;
INSERT INTO Employee2JobTitle(employee_id,job_title_id)
    select E.id, T.id
    from Employee E
    inner join JobTitle T on T.title = NEW.JobLevelHistory
    where E.Name = NEW.EmployeeName
        AND not exists (select *
            from Employee2JobTitle J
            where J.employee_id = E.id and J.job_title_id = T.id);
INSERT INTO Employee2Office(employee_id,office_id)
    select E.id, O.id
    from Employee E
    inner join Office O on O.office_number = NEW.OfficeHistory
    where E.Name = NEW.EmployeeName
        AND not exists (select *
            from Employee2Office J
            where J.employee_id = E.id and J.office_id = O.id);
END; |
delimiter ;

Note: The benefit of this trigger and table is that it works whether you are using LOAD DATA INFILE or just plain inserts. The trigger fires for each inserted row and adds data where it is needed.

Test it

insert into TmpEmp(EmployeeName,OfficeHistory,JobLevelHistory)
select 'John Smith',501,'Engineer' union all
select 'John Smith',601,'Senior Engineer' union all
select 'John Smith',701,'Manager' union all
select 'Alex Button',601,'Senior Assistant' union all
select 'Alex Button',454,'Manager';

truncate table TmpEmp;
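
To check the result, a query along these lines may help. It is a sketch that is not part of the original answer; it only assumes the tables from the question's DDL and should list each employee next to every office recorded for them:

```sql
-- List the employee/office pairs that the trigger created.
SELECT E.name, O.office_number
FROM Employee2Office EO
INNER JOIN Employee E ON E.id = EO.employee_id
INNER JOIN Office O ON O.id = EO.office_id
ORDER BY E.id, O.id;
```

Swapping in Employee2JobTitle and JobTitle gives the equivalent check for job titles.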

2 Comments

@cyberkiwi: wow, about to fall asleep, but going to run your code first thing in the morning.... Thanks for posting!!
+1 and selected as the answer. I don't 100% understand the code, but knowing it's possible is half the battle. Thanks!! One suggestion for others using this answer: run my MySQL DDL to create the database and tables before running cyberkiwi's code; otherwise the answer's code will run without error but not work. Again, thank you cyberkiwi. This is a huge help: I know this isn't the best way to code it, but I really wanted to understand what's possible in SQL.
1

Maybe you could get it working by using MySQL's LOAD DATA INFILE syntax.

According to the documentation, you can use it like this:

LOAD DATA INFILE 'data.txt' INTO TABLE db2.my_table;

and setting options like this:

FIELDS TERMINATED BY '\t' ENCLOSED BY '' ESCAPED BY '\\'
LINES TERMINATED BY '\n' STARTING BY ''
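
Put together, the full statement might look like the sketch below. The table and file names are the ones from the example above; `IGNORE 1 LINES` is an added assumption that skips the header row of the sample file. The FIELDS and LINES values shown happen to be MySQL's documented defaults, so they are spelled out here only for clarity.

```sql
-- Load a tab-separated file into a table whose columns match the file.
LOAD DATA INFILE 'data.txt'
INTO TABLE db2.my_table
FIELDS TERMINATED BY '\t' ENCLOSED BY '' ESCAPED BY '\\'
LINES TERMINATED BY '\n' STARTING BY ''
IGNORE 1 LINES;  -- assumption: the first line of the file is a header
```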

EDIT: Added one proposal:

1) Load the file into a temp table, let's call it temp (left out in this example)
2) Insert the basic data into the right tables
  INSERT INTO Employee (name)
  SELECT DISTINCT name FROM temp;

  INSERT INTO Office (office_number)
  SELECT DISTINCT office FROM temp;

  INSERT INTO JobTitle (title)
  SELECT DISTINCT job_level FROM temp;

3) Create the mapping tables by using joins, like:

  INSERT INTO Employee2Office (employee_id, office_id)
  SELECT DISTINCT Employee.id, Office.id FROM temp
  INNER JOIN Employee ON temp.name = Employee.name
  INNER JOIN Office ON temp.office = Office.office_number;

  Follow the same approach for the other mapping table.
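
For the other mapping table, the same pattern might look like this. It is a sketch that assumes the temp table has the `name` and `job_level` columns used in the statements above:

```sql
-- Map each employee to every job title listed for them in the temp table.
-- DISTINCT guards against duplicate rows violating the composite primary key.
INSERT INTO Employee2JobTitle (employee_id, job_title_id)
SELECT DISTINCT Employee.id, JobTitle.id
FROM temp
INNER JOIN Employee ON temp.name = Employee.name
INNER JOIN JobTitle ON temp.job_level = JobTitle.title;
```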

5 Comments

That only works for a single-table input going into a single table in the database. In my example, a single-table input is inserted row by row: each row insert checks whether the data already exists before inserting, and passes the PRIMARY KEY from each insert to the composite tables Employee2JobTitle and Employee2Office to support a many-to-many relationship. It's possible that I'm misunderstanding your answer, since I'm not even an SQL novice, but that's my read of it. Thanks for posting!!
@blunders, you're right. How about just dumping everything into a temp table first, as @prodigitalson suggested in the comments? From there you could go through the table to insert data into the appropriate tables. Sorry, no decent help here...
+2 Not a big deal. I upvoted both your comments, since you are trying to help; I can't upvote the answer since it doesn't directly answer my question, but I also failed to mention that I knew a one-to-one upload was possible. As for the temp table suggestion, I just don't understand how a temp table would differ from a "real" table; meaning I know what a temp table is, just don't get how it'd make a difference in the load process.
@blunders: I've added one more solution that could work. I tried it quickly and it seemed to construct the Employee2Office mapping table just right. I hope it's at least close :).
+1 Cool, thanks. I've been thinking about your update, trying to sort it out. I'll reply again once I've had the chance to work through it and try it... :-)
0

As for the temp table suggestion, I just don't understand how a temp table would differ from a "real" table; meaning I know what a temp table is, just don't get how it'd make a difference in the load process.

The temp table would allow you to load the data into a single flat table. For example you could implement the following process in SQL:

  1. Turn off foreign key checks
  2. Create a temp table to hold the data in the flat file as is
  3. Load the flat file data into the temp table
  4. Use INSERT INTO ... SELECT FROM to load the data into your main tables (Employee, JobTitle, Office - so 3 queries)
  5. Use a query to look up the auto-increment ids in the two relevant main tables by matching their values against the temp table, and insert those id pairs into your join table (you'll do this twice, once for each join table)
  6. Turn foreign key checks back on
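
The steps above can be sketched as one script. This is only an illustration under stated assumptions: the staging table name `staging`, its column names, and the file path are hypothetical, the main tables are assumed to start out empty, and `IGNORE 1 LINES` assumes the file has a header row. Steps 1 and 6 are optional here, because the parent tables are filled before the join tables.

```sql
-- Step 1 (optional): disable foreign key checks for this session.
SET FOREIGN_KEY_CHECKS = 0;

-- Step 2: staging table mirroring the flat file (hypothetical name and columns).
CREATE TEMPORARY TABLE staging (
  name      CHAR(50) NOT NULL,
  office    INT      NOT NULL,
  job_level CHAR(30) NOT NULL
);

-- Step 3: load the tab-separated file (placeholder path).
LOAD DATA INFILE '/path/to/employees.tsv'
INTO TABLE staging
FIELDS TERMINATED BY '\t'
LINES TERMINATED BY '\n'
IGNORE 1 LINES
(name, office, job_level);

-- Step 4: one INSERT ... SELECT per main table.
INSERT INTO Employee (name)        SELECT DISTINCT name      FROM staging;
INSERT INTO Office (office_number) SELECT DISTINCT office    FROM staging;
INSERT INTO JobTitle (title)       SELECT DISTINCT job_level FROM staging;

-- Step 5: resolve the generated ids and fill each join table.
INSERT INTO Employee2Office (employee_id, office_id)
SELECT DISTINCT e.id, o.id
FROM staging s
INNER JOIN Employee e ON e.name = s.name
INNER JOIN Office o ON o.office_number = s.office;

INSERT INTO Employee2JobTitle (employee_id, job_title_id)
SELECT DISTINCT e.id, t.id
FROM staging s
INNER JOIN Employee e ON e.name = s.name
INNER JOIN JobTitle t ON t.title = s.job_level;

-- Step 6: re-enable foreign key checks and clean up.
SET FOREIGN_KEY_CHECKS = 1;
DROP TEMPORARY TABLE staging;
```

If the main tables already contain data, each step 4 insert would also need a NOT EXISTS filter to avoid duplicate rows.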

This is what I meant by saying it has to be scripted. There is no way that MySQL can magically map the relationships from flat data; you need to do that yourself. You could write it in SQL using the steps above, but it seems simpler to use a scripting language you're familiar with instead, which avoids all kinds of possible permissions/access issues with using LOAD DATA.

2 Comments

Turning off foreign key checks is not an option, since it kills referential integrity: if you ever turn off that control, then even after you turn it back on you'll never know the database's state as it relates to referential integrity. The reason is that any transactions that take place while the checks are turned off will not be checked when the control is turned back on, so referential integrity may be broken without the database knowing it; at least that's the way MySQL works.
@blunders: Well, you can leave it on if you'd like. I normally disable them for a situation like this because there won't be an instance where a constraint breaks (assuming the data file is as clean as you claim). Then again, I wouldn't be doing this in raw SQL either :-)
