9

I have to copy all rows from one table to another (the tables have identical schemas). Which performs better:

  • Drop table1 and create as select * from table2
  • Delete all rows from table1 and insert all rows from table2
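
For concreteness, the two options might be written as follows. This is a sketch in PostgreSQL syntax using the `table1` and `table2` names from the list above; wrapping each option in a transaction is an assumption, not something the original test is known to have done.

```sql
-- Option 1: drop the target and re-create it from a query
BEGIN;
DROP TABLE table1;
CREATE TABLE table1 AS SELECT * FROM table2;
COMMIT;

-- Option 2: empty the target and refill it
BEGIN;
DELETE FROM table1;
INSERT INTO table1 SELECT * FROM table2;
COMMIT;
```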

Update: I ran a small test on a table with almost 3k rows. Drop and create takes about 60 ms; delete and insert takes about 30 ms.

3
  • My intuition tells me the fastest way would be truncate and insert, since delete scans every row and deletes them individually, whereas truncate just plain empties the table with no possible conditions. Commented Aug 11, 2011 at 8:08
  • 2
    3k rows... seriously? And you are talking about performance? Premature optimization, anyone? When I read your initial post, I thought you were talking about several million rows. 3k rows is nothing. For 3k rows you probably don't even need a DB ;) Commented Aug 11, 2011 at 8:19
  • 3
    Well depends on how often he needs to do it, and how often concurrent transactions need to access it, doesn't it? :) Commented Aug 11, 2011 at 11:46

4 Answers

18

I see four useful ways to replace the contents of the table. None of them is "obviously right"; the choice depends on your requirements.

  1. (In a single transaction) DELETE FROM foo; INSERT INTO foo SELECT ...

    Pro: Best concurrency: doesn't lock out other transactions accessing the table, as it leverages Postgres's MVCC.

    Con: Probably the slowest if you measure the insert-speed alone. Causes autovacuum to clean up dead rows, thus creating a higher I/O load.

  2. TRUNCATE foo; INSERT INTO foo SELECT ...

    Pro: Fastest for smaller tables. Causes less write I/O than #1

    Con: Excludes all other readers -- other transactions reading from the table will have to wait.

  3. TRUNCATE foo, DROP all indexes on table, INSERT INTO foo SELECT ..., re-create all indexes.

    Pro: Fastest for large tables, because creating indexes with CREATE INDEX is faster than updating them incrementally.

    Con: Same as #2

  4. The switcheroo. Create two identical tables foo and foo_tmp

    TRUNCATE foo_tmp;
    INSERT INTO foo_tmp SELECT ...;
    ALTER TABLE foo RENAME TO foo_tmp1;
    ALTER TABLE foo_tmp RENAME TO foo;
    ALTER TABLE foo_tmp1 RENAME TO foo_tmp;
    

    Thanks to PostgreSQL's transactional DDL capabilities, if this is done in a transaction, the rename is performed without other transactions noticing. You can also combine this with #3 and drop/create indexes.

    Pro: Less I/O performed, like #2, and without locking out other readers (locks taken only during the rename part).

    Con: The most complicated. Also you cannot have foreign keys or views pointing to the table, as they would point to the wrong table after renaming it.
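
The four variants above could be sketched roughly as follows. The source table `foo_source`, the column `some_column` and the index name `foo_some_column_ix` are hypothetical placeholders for whatever the elided `SELECT ...` and the real indexes are; treat this as an illustration under those assumptions rather than a drop-in script.

```sql
-- #1: DELETE + INSERT in one transaction.
-- Concurrent readers keep seeing the old rows until COMMIT.
BEGIN;
DELETE FROM foo;
INSERT INTO foo SELECT * FROM foo_source;   -- foo_source is hypothetical
COMMIT;

-- #2 and #3: TRUNCATE + INSERT; the index statements are what #3 adds.
-- TRUNCATE blocks other readers of foo until COMMIT.
BEGIN;
TRUNCATE foo;
DROP INDEX IF EXISTS foo_some_column_ix;    -- hypothetical index name
INSERT INTO foo SELECT * FROM foo_source;
CREATE INDEX foo_some_column_ix ON foo (some_column);
COMMIT;

-- #4: fill the spare table first, then swap names in a short transaction.
TRUNCATE foo_tmp;
INSERT INTO foo_tmp SELECT * FROM foo_source;

BEGIN;
ALTER TABLE foo RENAME TO foo_tmp1;
ALTER TABLE foo_tmp RENAME TO foo;
ALTER TABLE foo_tmp1 RENAME TO foo_tmp;
COMMIT;
```

In the #4 sketch the load happens outside the renaming transaction, so the exclusive lock on `foo` is held only for the three `ALTER TABLE` statements.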


5 Comments

  • I had to rename/drop my indexes after doing the combined #4 and #3. Primary-key indexes are renamed automatically, others are not. The total time to drop and rebuild ~800.000 rows from a view went from 90s to about 20s. Thanks for the tip.
  • Does this answer still hold with the latest versions of PostgreSQL, like 10+?
  • @PirateApp There's an additional method of doing this using INSERT ... ON CONFLICT DO UPDATE etc., with its own tradeoffs. Apart from that, there have been smaller optimizations, but what I wrote is still relevant.
  • What is the effect of dropping indexes? Aren't indexes automatically dropped when we truncate foo?
  • @skan It's not about TRUNCATE. Indexes make the INSERT commands run slower. If you drop the indexes, then INSERT all the data, then re-create the indexes, PostgreSQL will use bulk index creation, which is much faster in total time than maintaining the indexes at INSERT time.

2

Use TRUNCATE instead of DROP TABLE or DELETE when you have to get rid of all records in a table. With TRUNCATE you can still use triggers in PostgreSQL and permissions are easier to set and maintain.

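Like DROP, TRUNCATE needs an exclusive lock on the table (ACCESS EXCLUSIVE in PostgreSQL), so other transactions cannot read the table until the truncating transaction finishes.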
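
To observe that lock, something like the following could be run in a scratch database. It is a sketch: `foo` is a placeholder table name, and the `ROLLBACK` at the end is there only so the experiment leaves the data untouched, which works because TRUNCATE is transactional in PostgreSQL.

```sql
BEGIN;
TRUNCATE foo;

-- Locks this session currently holds on foo. The ACCESS EXCLUSIVE lock
-- taken by TRUNCATE shows up in pg_locks as mode 'AccessExclusiveLock'.
SELECT mode, granted
FROM pg_locks
WHERE relation = 'foo'::regclass
  AND pid = pg_backend_pid();

ROLLBACK;  -- undoes the TRUNCATE and releases the lock
```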


2

Here are comparative timings for the approaches in intgr's answer (see the code below):

  1. delete/insert - 36 sec.
  2. truncate/insert - 19 sec.
  3. drop index/truncate/insert/create index - 13 sec.

    -- preparations
    drop table if exists temp_refresh_experiment;
    -- million random strings
    create table temp_refresh_experiment as
    select 
        upper(substr(md5(random()::text), 0, 25)) as some_column
    FROM
        generate_series(1,1000000) i;
    -- create index
    create index temp_refresh_experiment_ix on temp_refresh_experiment(some_column)
    ;
    
    
    -- 1. delete/insert
    delete from temp_refresh_experiment;
    insert into temp_refresh_experiment(some_column)
    select
    upper(substr(md5(random()::text), 0, 25)) as some_column
    FROM
        generate_series(1,1000000) i;
    -- 36 secs
    
    
    -- 2. truncate/insert
    truncate temp_refresh_experiment;
    insert into temp_refresh_experiment(some_column)
    select
    upper(substr(md5(random()::text), 0, 25)) as some_column
    FROM
        generate_series(1,1000000) i;
    -- 19 sec   
    
    
    -- 3. drop index/truncate/insert/create index
    drop index if exists temp_refresh_experiment_ix;
    truncate temp_refresh_experiment;
    insert into temp_refresh_experiment(some_column)
    select
    upper(substr(md5(random()::text), 0, 25)) as some_column
    FROM
        generate_series(1,1000000) i; 
    create index temp_refresh_experiment_ix on temp_refresh_experiment(some_column)
    ;
    -- 13 sec
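
The answer does not say how the timings were taken. One way to reproduce such a measurement from plain SQL is to wrap a variant in a `DO` block and report the elapsed wall-clock time with `clock_timestamp()`. The following is a sketch for the truncate/insert variant, reusing the table from the script above; it measures time inside the block only and so excludes commit time.

```sql
DO $$
DECLARE
    t0 timestamptz := clock_timestamp();  -- wall-clock time at block start
BEGIN
    TRUNCATE temp_refresh_experiment;

    INSERT INTO temp_refresh_experiment(some_column)
    SELECT upper(substr(md5(random()::text), 0, 25))
    FROM generate_series(1, 1000000) i;

    -- clock_timestamp() advances during the transaction, unlike now()
    RAISE NOTICE 'truncate/insert took %', clock_timestamp() - t0;
END
$$;
```

Absolute numbers will vary with hardware and configuration, so only the relative ordering of the variants is meaningful.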
    


1

If you are talking about executing the INSERTs manually, one by one, then DROP/CREATE will be much faster. Also, CREATE TABLE AS copies only the column definitions and the data. Indexes and other constraints are not copied. This speeds up the copy enormously, but you have to remember to re-create them on the new copy once you're finished.

The same goes for SELECT INTO. They are functionally identical. They just have different names.

In any case, when copying large tables, disable triggers, indexes, and constraints to gain performance.
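
A rough sketch of both points follows, using the `table1` and `table2` names from the question. The `id` and `some_column` columns and the index name are hypothetical; substitute whatever keys and indexes the real table has.

```sql
-- Rebuild via CREATE TABLE AS, then re-create what was not copied.
BEGIN;
DROP TABLE IF EXISTS table1;
CREATE TABLE table1 AS SELECT * FROM table2;   -- columns and data only
ALTER TABLE table1 ADD PRIMARY KEY (id);       -- hypothetical key column
CREATE INDEX table1_some_column_ix
    ON table1 (some_column);                   -- hypothetical index
COMMIT;

-- Alternatively, keep the table and switch off user-defined triggers
-- for the duration of the bulk load.
BEGIN;
ALTER TABLE table1 DISABLE TRIGGER USER;
TRUNCATE table1;
INSERT INTO table1 SELECT * FROM table2;
ALTER TABLE table1 ENABLE TRIGGER USER;
COMMIT;
```

`DISABLE TRIGGER USER` leaves the internally generated constraint triggers in place, so foreign-key checks still run during the load.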
