SQL Insert into and Select multiple columns?

Question

So I have a table that looks something like this:

Communication: (Calls made)

Timestamp            FromIDNumber ToIDNumber GeneralLocation 
2012-03-02 09:02:30  878          674        Grasslands 
2012-03-02 11:30:01  456          213        Tundra 
2012-03-02 07:02:12  789          654        Mountains
2012-03-02 08:06:08  458          789        Tundra

And I want to create a new table that has all the distinct FromIDNumber and ToIDNumber values.

This is the SQL Fiddle for it.

This works:

INSERT INTO CommIDTemp (`ID`)
SELECT DISTINCT Communication.FromIDNumber
FROM Communication
UNION DISTINCT 
SELECT DISTINCT Communication.ToIDNumber
FROM Communication;

and I got:

But I wonder if there is a more efficient way, because the dataset I have has millions and millions of rows and I don't know how UNION DISTINCT performs.

I originally tried something like

INSERT INTO CommIDTemp (`ID`) 
SELECT DISTINCT Communication.FromIDNumber
AND Communication.ToIDNumber 
FROM Communication;

but that didn't work... is there any other way to do this more efficiently? I'm pretty new to SQL, so any help would be greatly appreciated, thank you!!

SELECT A AND B will try to insert the logical AND of the two values, not the values themselves. For example, select 'a' and 'b' returns 0.
– Marc B, Commented Jun 2, 2015 at 21:50
This is a one-time task? So it does not really matter how long it takes? What will you do about adding new values as more data comes in?
– Rick James, Commented Jun 5, 2015 at 5:51
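
As a small illustration of Marc B's comment, the following MySQL sketch uses literal values taken from the first sample row rather than the real table. It shows that AND collapses two operands into one boolean, which is why the original attempt could not produce a list of IDs.

```sql
-- AND is a logical operator: two non-zero numbers give 1 (true).
SELECT 878 AND 674 AS and_result;

-- Strings that do not start with a number are converted to 0,
-- so this gives 0 (false), as in the comment above.
SELECT 'a' AND 'b' AS and_result;
```

Because every ID in the sample data is non-zero, SELECT DISTINCT FromIDNumber AND ToIDNumber would return the single value 1 for that data.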

Philipp · Accepted Answer · 2015-06-02 22:25:58Z

3

First thing: I do not have experience with tables this big, so you will have to test the following tips yourself to find out whether they really work in your situation:

1. Create index in the source table

Make sure that both columns FromIDNumber and ToIDNumber have an index, i.e.

ALTER TABLE Communication ADD INDEX (FromIDNumber);
ALTER TABLE Communication ADD INDEX (ToIDNumber);
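
A possible variation, sketched under the assumption that the table is the Communication table from the question: MySQL accepts several ADD INDEX clauses in one ALTER TABLE, and SHOW INDEX and EXPLAIN can be used to check the result. The index names idx_from and idx_to are made up for this example.

```sql
-- Add both indexes in a single ALTER TABLE statement.
-- idx_from and idx_to are hypothetical names chosen for this sketch.
ALTER TABLE Communication
    ADD INDEX idx_from (FromIDNumber),
    ADD INDEX idx_to (ToIDNumber);

-- List the indexes that now exist on the table.
SHOW INDEX FROM Communication;

-- Ask the optimizer how it plans to read one branch of the UNION.
EXPLAIN SELECT FromIDNumber FROM Communication;
```

If the EXPLAIN output lists one of the new indexes in its key column, that branch can be read from the index alone; whether this helps on millions of rows still has to be measured on the real data.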

2. Try to remove DISTINCT

I could not find a faster query for your example, but you might try the query without the DISTINCT keyword: UNION returns only distinct values by definition, so this SQL gives the same result as your current query:

INSERT INTO CommIDTemp (`ID`)
SELECT FromIDNumber FROM Communication
UNION 
SELECT ToIDNumber FROM Communication;

3. Use a primary key in the temp table

Also try another approach: make the CommIDTemp.ID column a primary key and use INSERT IGNORE. This is especially useful if you want to update the table frequently without deleting its contents:

CREATE TABLE CommIDTemp (ID INT PRIMARY KEY);

INSERT IGNORE INTO CommIDTemp (`ID`)
SELECT FromIDNumber FROM Communication
UNION
SELECT ToIDNumber FROM Communication;
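
To show what updating without deleting the contents could look like, here is a minimal sketch. It assumes CommIDTemp was created with the primary key as above and already holds the IDs from the four sample rows. The new call (from 878 to 999) is invented for illustration, and the column list assumes the table has only the four columns shown in the question.

```sql
-- A new call arrives later (hypothetical row, not from the question).
INSERT INTO Communication (`Timestamp`, FromIDNumber, ToIDNumber, GeneralLocation)
VALUES ('2012-03-03 10:00:00', 878, 999, 'Tundra');

-- Re-run the same statement. IDs already in CommIDTemp (such as 878)
-- hit the primary key and are skipped; only 999 is added.
INSERT IGNORE INTO CommIDTemp (`ID`)
SELECT FromIDNumber FROM Communication
UNION
SELECT ToIDNumber FROM Communication;
```

Without IGNORE, the second run would stop with a duplicate-key error on the first ID that already exists.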


Rick James Over a year ago

UNION defaults to DISTINCT, so that won't make any difference. The other option is UNION ALL, but that may give you duplicates.

kheld · Accepted Answer · 2015-06-02 22:30:23Z

2

Performance is mainly going to depend on how the table is indexed. I don't see a way to do everything in one pass so I would suggest separate indexes on FromIDNumber and ToIDNumber. That should make each statement in your union very fast even for a lot of rows.

You can make this faster by using only one DISTINCT. Each DISTINCT requires a sort or a temp table. You can drop the DISTINCT from each SELECT, and the UNION DISTINCT will still make sure you get distinct values.

INSERT INTO CommIDTemp (`ID`)
SELECT Communication.FromIDNumber
FROM Communication
UNION DISTINCT 
SELECT Communication.ToIDNumber
FROM Communication;

Side Note: UNION ALL is faster than UNION DISTINCT but based on your requirements you need UNION DISTINCT which can be written as simply UNION.
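
The difference between the two operators can be checked with a pair of counts. This is only a sketch against the Communication table from the question; the aliases all_ids and unique_ids are arbitrary names.

```sql
-- UNION ALL keeps every value, including repeats.
SELECT COUNT(*) AS ids_with_duplicates
FROM (
    SELECT FromIDNumber AS ID FROM Communication
    UNION ALL
    SELECT ToIDNumber FROM Communication
) AS all_ids;

-- UNION (the same as UNION DISTINCT) removes the repeats.
SELECT COUNT(*) AS distinct_ids
FROM (
    SELECT FromIDNumber AS ID FROM Communication
    UNION
    SELECT ToIDNumber FROM Communication
) AS unique_ids;
```

For the four sample rows in the question, the first count is 8 and the second is 7, because 789 appears once as a caller and once as a recipient.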

edited Jun 2, 2015 at 22:30

answered Jun 2, 2015 at 22:13
