96

I have a table with approx 5 million rows which has a fk constraint referencing the primary key of another table (also approx 5 million rows).

I need to delete about 75000 rows from both tables. I know that if I try doing this with the fk constraint enabled it's going to take an unacceptable amount of time.

Coming from an Oracle background, my first thought was to disable the constraint, do the delete, and then re-enable the constraint. PostgreSQL appears to let me disable constraint triggers if I am a superuser (I'm not, but I am logging in as the user that owns/created the objects), but that doesn't seem to be quite what I want.

The other option is to drop the constraint and then reinstate it. I'm worried that rebuilding the constraint is going to take ages given the size of my tables.

Any thoughts?

edit: after Billy's encouragement I've tried doing the delete without changing any constraints and it takes in excess of 10 minutes. However, I have discovered that the table from which I'm trying to delete has a self referential foreign key ... duplicated (& non indexed).

Final update - I dropped the self referential foreign key, did my delete and added it back in. Billy's right all round but unfortunately I can't accept his comment as the answer!
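The approach from the final update can be sketched roughly as follows. The question does not show the schema, so every name here (table `node`, column `parent_id`, constraint `node_parent_fk`, the `obsolete` flag) is hypothetical. Because PostgreSQL DDL is transactional, the drop, the delete and the re-add can share one transaction.

```sql
-- Hypothetical schema: the question does not show the real tables.
CREATE TABLE node (
    id        bigint PRIMARY KEY,
    parent_id bigint,
    obsolete  boolean NOT NULL DEFAULT false,
    CONSTRAINT node_parent_fk FOREIGN KEY (parent_id) REFERENCES node (id)
);

BEGIN;

-- Drop the self-referential FK so the DELETE is not checked row by row.
ALTER TABLE node DROP CONSTRAINT node_parent_fk;

DELETE FROM node WHERE obsolete;

-- Re-adding the constraint validates the remaining rows in one pass and
-- fails if the DELETE left an orphaned parent_id behind.
ALTER TABLE node
    ADD CONSTRAINT node_parent_fk FOREIGN KEY (parent_id) REFERENCES node (id);

COMMIT;

-- Alternative: keep the FK and index the referencing column.
-- PostgreSQL does not create this index automatically.
CREATE INDEX node_parent_id_idx ON node (parent_id);
```

With the index in place, each check triggered by a delete becomes an index lookup on `parent_id` rather than a scan of the table, which is probably why the non-indexed self-reference was so slow.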

4
  • 4
    If it's taking that long, even with 5 million rows, then you have something set up wrong. Commented Apr 21, 2010 at 2:07
  • What? The delete or the reenabling the constraint? And yes, it's quite possible something(s) is set up wrongly or in a less than optimised manner - the database has pretty much been 'built' by hibernate (I had nothing to do with that). Commented Apr 21, 2010 at 2:10
  • 14
    The delete. The cost of FK checks against indexed tables grows linearly with the number of rows removed, and you are removing 75,000 + 75,000 = 150,000 rows. Consider a worst case of about 23 comparisons per FK check (binary search, lg(5 million) ≈ 22.3), and perhaps 20 machine comparisons per row comparison, for a total of 69,000,000 comparisons. Even with a conservative estimate of the average machine doing a billion comparisons a second, this should still take less than a second of CPU time. Loading from disk also shouldn't be a major issue, because even at 5 million rows the table should fit in RAM. Commented Apr 21, 2010 at 2:15
  • 1
    OK Billy - I'll give the straight delete another go ... I'm pretty sure when I last tried it (this is work I've come back to after a month or so) it was very slow. Commented Apr 21, 2010 at 2:19

6 Answers

79

Per the previous comments, it shouldn't be a problem. That said, there is a command that may be what you're looking for: it sets the constraints to deferred, so they're checked on COMMIT rather than on every delete. If you're doing just one big DELETE of all the rows, it won't make a difference, but if you're doing it in pieces, it will.

SET CONSTRAINTS ALL DEFERRED

is what you are looking for in that case. Note that constraints must be marked as DEFERRABLE before they can be deferred. For example:

ALTER TABLE table_name
  ADD CONSTRAINT constraint_uk UNIQUE(column_1, column_2)
  DEFERRABLE INITIALLY IMMEDIATE;

The constraint can then be deferred in a transaction or function as follows:

CREATE OR REPLACE FUNCTION f() RETURNS void AS
$BODY$
BEGIN
  SET CONSTRAINTS ALL DEFERRED;

  -- Code that temporarily violates the constraint...
  -- UPDATE table_name ...
END;
$BODY$
  LANGUAGE plpgsql VOLATILE
  COST 100;
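Applied to a foreign key, the same idea might look like the sketch below. The tables `parent` and `child` and the constraint name `child_parent_fk` are made up for illustration; the only requirement is that the FK is declared DEFERRABLE.

```sql
-- Hypothetical tables for illustration only.
CREATE TABLE parent (id bigint PRIMARY KEY);
CREATE TABLE child (
    id        bigint PRIMARY KEY,
    parent_id bigint NOT NULL,
    CONSTRAINT child_parent_fk FOREIGN KEY (parent_id) REFERENCES parent (id)
        DEFERRABLE INITIALLY IMMEDIATE
);
INSERT INTO parent VALUES (1);
INSERT INTO child VALUES (10, 1);

BEGIN;
SET CONSTRAINTS ALL DEFERRED;

-- Without deferral, deleting the parent first would fail immediately.
DELETE FROM parent WHERE id = 1;
DELETE FROM child WHERE parent_id = 1;

-- Optional: run the postponed checks now instead of waiting for COMMIT.
SET CONSTRAINTS ALL IMMEDIATE;
COMMIT;
```

SET CONSTRAINTS only affects the current transaction, so nothing needs to be switched back afterwards. The checks still run, just later, so this changes the order of the work rather than the amount.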

4 Comments

Certainly worth a try, but I'm not convinced that deferred constraints are any faster. AFAIK they just shift the validation work from DELETE-time to COMMIT-time.
I would have given this a go but dropping the fk and reinstating it worked. Like intgr, I wonder if it would not just change the checking of the fk to commit time so I'll definitely remember it for next time.
I dropped a database and re-imported it after running SET CONSTRAINTS ALL DEFERRED. Is there a way to "re-enable" these constraints once the import is done? It's a pretty huge file, so it would be pretty hard to re-order the table creation. I got around this before by importing the data twice.
I've never tried this, but I have tried dropping constraints then re-adding them after, and it was much faster than leaving them in while I delete rows.
52

What worked for me was to disable, one by one, the TRIGGERS of the tables that are going to be involved in the DELETE operation.

ALTER TABLE reference DISABLE TRIGGER ALL;
DELETE FROM reference WHERE refered_id > 1;
ALTER TABLE reference ENABLE TRIGGER ALL;

This solution works in version 9.3.16. In my case, the time to execute the DELETE operations went from 45 minutes to 14 seconds.

As stated in the comments section by @amphetamachine, you will need to have admin privileges to the tables to perform this task.
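While the triggers are disabled, nothing checks referential integrity, so it may be worth looking for orphaned rows before relying on the data again. The query below is only a sketch: it assumes that `reference.refered_id` is the foreign key column and that it points at the `id` column of a table called `referenced`, which is a hypothetical name not taken from the answer.

```sql
-- Assumed schema (hypothetical): reference.refered_id -> referenced.id
-- Lists rows in "reference" whose target row no longer exists.
SELECT r.*
FROM reference AS r
WHERE r.refered_id IS NOT NULL
  AND NOT EXISTS (
      SELECT 1
      FROM referenced AS p
      WHERE p.id = r.refered_id
  );
```

An empty result suggests that the delete left no dangling references for that constraint.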

7 Comments

Note that the PostgreSQL user executing the ALTER TABLE commands must be the owner of that table.
Will that disable fk constraints also?
For me, executing the ALTER TABLE command was incredibly slow.
This did not disable the UNIQUE validation, or else I did something wrong (maybe UNIQUE constraint is not managed as a "TRIGGER")
@TheRedPea: Maybe it was a non-deferrable unique constraint ... this page has more information: How can I disable all contraints in my postgresql?: "... Non-deferrable primary key, unique and exclusion constraints have no associated triggers and are not affected."
31

If you try DISABLE TRIGGER ALL and get an error like permission denied: "RI_ConstraintTrigger_a_16428" is a system trigger (I got this on Amazon RDS), try this:

set session_replication_role to replica;

If this succeeds, all triggers that underlie table constraints will be disabled. Now it's up to you to make sure your changes leave the DB in a consistent state!

Then when you are done, reenable triggers & constraints for your session with:

set session_replication_role to default;
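One way to limit how long the checks stay off is to scope the setting to a single transaction with SET LOCAL, which reverts on COMMIT or ROLLBACK. This is a sketch with hypothetical tables (`account`, `invoice`), and it still needs a role that is allowed to change `session_replication_role`.

```sql
-- Hypothetical tables for illustration only.
CREATE TABLE account (id bigint PRIMARY KEY);
CREATE TABLE invoice (
    id         bigint PRIMARY KEY,
    account_id bigint NOT NULL REFERENCES account (id)
);
INSERT INTO account VALUES (1);
INSERT INTO invoice VALUES (100, 1);

BEGIN;
-- SET LOCAL lasts only until the end of this transaction.
SET LOCAL session_replication_role = replica;

-- The FK triggers do not fire in replica mode, so the order is free.
DELETE FROM account WHERE id = 1;
DELETE FROM invoice WHERE account_id = 1;
COMMIT;

-- Shows the value the session had before the transaction.
SHOW session_replication_role;
```

Scoping the change this way avoids forgetting the final reset, but keeping the data consistent inside the transaction remains your responsibility.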

4 Comments

This doesn't work with the latest version of PostgreSQL.
Works fine with Postgres 12. Do note you need to be superuser to set session_replication_role though.
I am using root already. Not working for me
The user "root" may not be a superuser in postgresql. The default superuser is usually called "postgres" so you could try that. You could even use the "createuser --superuser" command to make a new superuser in the database just for this task and then drop the user afterward.
9

(This answer assumes your intent is to delete all of the rows of these tables, not just a selection.)

I also had to do this, but as part of a test suite. I found the answer, suggested elsewhere on SO. Use TRUNCATE TABLE as follows:

TRUNCATE TABLE <list-of-table-names> [RESTART IDENTITY] [CASCADE];

The following quickly deletes all rows from tables table1, table2, and table3, provided that there are no references to rows of these tables from tables not listed:

TRUNCATE TABLE table1, table2, table3;

As long as references are between the tables listed, PostgreSQL will delete all the rows without concern for referential integrity. If a table other than those listed references a row of one of these tables, the query will fail.

However, you can qualify the query so that it also truncates all tables with references to the listed tables (although I have not tried this):

TRUNCATE TABLE table1, table2, table3 CASCADE;

By default, the sequences of these tables do not restart numbering. New rows will continue with the next number of the sequence. To restart sequence numbering:

TRUNCATE TABLE table1, table2, table3 RESTART IDENTITY;
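Since TRUNCATE in PostgreSQL is transaction-safe for the table data, a cautious way to try these variants is inside a transaction that can be rolled back. The table definitions below are hypothetical stand-ins for `table1` and `table2`.

```sql
-- Hypothetical definitions so the sketch can be run on its own.
CREATE TABLE table1 (id bigserial PRIMARY KEY);
CREATE TABLE table2 (
    id        bigserial PRIMARY KEY,
    table1_id bigint REFERENCES table1 (id)
);
INSERT INTO table1 DEFAULT VALUES;
INSERT INTO table2 (table1_id) VALUES (1);

BEGIN;
-- Both tables are listed, so the FK between them does not block this.
TRUNCATE TABLE table1, table2 RESTART IDENTITY;

-- Inspect the result before deciding.
SELECT count(*) FROM table1;
SELECT count(*) FROM table2;

ROLLBACK;  -- use COMMIT instead to keep the truncation
```

Note that TRUNCATE takes an exclusive lock on each listed table until the transaction ends, so the inspection step should be kept short on a busy database.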


7

My PostgreSQL is 9.6.8.

set session_replication_role to replica;

works for me, but it requires superuser permission.

I log in to psql as the superuser:

sudo -u postgres psql

Then I connect to my database:

\c myDB

And run:

set session_replication_role to replica;

Now I can delete from the table with the constraint.


-12

-- Disable all table constraints

ALTER TABLE TableName NOCHECK CONSTRAINT ConstraintName

-- Enable all table constraints

ALTER TABLE TableName CHECK CONSTRAINT ConstraintName

2 Comments

Question was about Postgresql which doesn't have that capability (as of v9.4).
Agreed, v9.4 doesn't have this feature: ERROR: syntax error at or near "NOCHECK" LINE 1: ALTER TABLE TableName NOCHECK CONSTRAINT ConstraintName
