15

I have a deferred AFTER UPDATE trigger on a table, set to fire when a certain column is updated. It's an integer type I'm using as a counter.

I'm not 100% certain but it looks like if I increment that particular column 100 times during a transaction, the trigger is queued up and executed 100 times at the end of the transaction.

I would like the trigger to only be scheduled once per row no matter how many times I've incremented that column.

Can I do that somehow? Alternatively, if fired triggers must queue up regardless of whether they are duplicates, can I clear this queue during the first run of the trigger?

Version of Postgres is 9.1. Here's what I got:

CREATE CONSTRAINT TRIGGER counter_change
    AFTER UPDATE OF "Counter" ON "table"
    DEFERRABLE INITIALLY DEFERRED
    FOR EACH ROW
    EXECUTE PROCEDURE counter_change();

CREATE OR REPLACE FUNCTION counter_change()
    RETURNS trigger
    LANGUAGE plpgsql
    AS $$
DECLARE
BEGIN

PERFORM some_expensive_procedure(NEW."id");

RETURN NEW;

END;$$;
1
  • Your version of Postgres would help. Also the (basic) code of your trigger and trigger function. Commented Jan 21, 2012 at 1:07

4 Answers

18

This is a tricky problem. But it can be done with per-column triggers and conditional trigger execution introduced in PostgreSQL 9.0.

You need an "updated" flag per row for this solution. Use a boolean column in the same table for simplicity. But it could be in another table or even a temporary table per transaction.

The expensive payload is executed once per row where the counter is updated, no matter whether it is updated once or multiple times in the transaction.

This should also perform well, because ...

  • ... it avoids multiple calls of triggers at the root (scales well)
  • ... does not change additional rows (minimize table bloat)
  • ... does not need expensive exception handling.

Consider the following

Demo

Tested in PostgreSQL 9.1 with a separate schema x as test environment.

Tables and dummy rows

-- DROP SCHEMA x;
CREATE SCHEMA x;

CREATE TABLE x.tbl (
 id int
,counter int
,trig_exec_count integer  -- for monitoring payload execution.
,updated bool);

Insert two rows to demonstrate it works with multiple rows:

INSERT INTO x.tbl VALUES
 (1, 0, 0, NULL)
,(2, 0, 0, NULL);

Trigger functions and Triggers

1.) Execute expensive payload

CREATE OR REPLACE FUNCTION x.trg_upaft_counter_change_1()
    RETURNS trigger AS
$BODY$
BEGIN

 -- PERFORM some_expensive_procedure(NEW.id);
 -- Update trig_exec_count to count execution of expensive payload.
 -- Could be in another table, for simplicity, I use the same:

UPDATE x.tbl t
SET    trig_exec_count = trig_exec_count + 1
WHERE  t.id = NEW.id;

RETURN NULL;  -- RETURN value of AFTER trigger is ignored anyway

END;
$BODY$ LANGUAGE plpgsql;

2.) Flag row as updated.

CREATE OR REPLACE FUNCTION x.trg_upaft_counter_change_2()
    RETURNS trigger AS
$BODY$
BEGIN

UPDATE x.tbl
SET    updated = TRUE
WHERE  id = NEW.id;
RETURN NULL;

END;
$BODY$ LANGUAGE plpgsql;

3.) Reset "updated" flag.

CREATE OR REPLACE FUNCTION x.trg_upaft_counter_change_3()
    RETURNS trigger AS
$BODY$
BEGIN

UPDATE x.tbl
SET    updated = NULL
WHERE  id = NEW.id;
RETURN NULL;

END;
$BODY$ LANGUAGE plpgsql;

Trigger names are relevant! Called for the same event they are executed in alphabetical order.

1.) Payload, only if not "updated" yet:

CREATE CONSTRAINT TRIGGER upaft_counter_change_1
    AFTER UPDATE OF counter ON x.tbl
    DEFERRABLE INITIALLY DEFERRED
    FOR EACH ROW
    WHEN (NEW.updated IS NULL)
    EXECUTE PROCEDURE x.trg_upaft_counter_change_1();

2.) Flag row as updated, only if not "updated" yet:

CREATE TRIGGER upaft_counter_change_2   -- not deferred!
    AFTER UPDATE OF counter ON x.tbl
    FOR EACH ROW
    WHEN (NEW.updated IS NULL)
    EXECUTE PROCEDURE x.trg_upaft_counter_change_2();

3.) Reset Flag. No endless loop because of trigger condition.

CREATE CONSTRAINT TRIGGER upaft_counter_change_3
    AFTER UPDATE OF updated ON x.tbl
    DEFERRABLE INITIALLY DEFERRED
    FOR EACH ROW
    WHEN (NEW.updated)
    EXECUTE PROCEDURE x.trg_upaft_counter_change_3();
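
Since the firing order depends on the trigger names, it can help to list the triggers on the table once they are created. The following is a minimal sketch against the system catalog pg_trigger, assuming the demo schema x from above; sorting by tgname approximates the firing order for plain ASCII names like these.

```sql
-- List the user-defined triggers on x.tbl in name order,
-- together with their deferral settings.
SELECT tgname,
       tgdeferrable,    -- true for the two constraint triggers
       tginitdeferred   -- true if INITIALLY DEFERRED
FROM   pg_trigger
WHERE  tgrelid = 'x.tbl'::regclass
AND    NOT tgisinternal  -- skip triggers generated internally for constraints
ORDER  BY tgname;
```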

Test

Run UPDATE & SELECT separately to see the deferred effect. If executed together (in one transaction) the SELECT will show the new tbl.counter but the old tbl.trig_exec_count.

UPDATE x.tbl SET counter = counter + 1;

SELECT * FROM x.tbl;

Now, update the counter multiple times (in one transaction). The payload will only be executed once. Voilà!

UPDATE x.tbl SET counter = counter + 1;
UPDATE x.tbl SET counter = counter + 1;
UPDATE x.tbl SET counter = counter + 1;
UPDATE x.tbl SET counter = counter + 1;
UPDATE x.tbl SET counter = counter + 1;

SELECT * FROM x.tbl;
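
To make the "in one transaction" part explicit, the same test can be wrapped in BEGIN / COMMIT. This is only a sketch using the demo objects defined above. The comments describe what the trigger definitions should produce, not captured output.

```sql
BEGIN;

UPDATE x.tbl SET counter = counter + 1;
UPDATE x.tbl SET counter = counter + 1;
UPDATE x.tbl SET counter = counter + 1;

-- Still inside the transaction: counter has advanced and the flag is set,
-- but trig_exec_count should be unchanged because the payload trigger
-- is deferred until commit.
SELECT id, counter, trig_exec_count, updated FROM x.tbl ORDER BY id;

COMMIT;  -- deferred triggers fire here: payload once per row, then flag reset

-- After commit: trig_exec_count should have grown by exactly 1 per row,
-- and updated should be NULL again.
SELECT id, counter, trig_exec_count, updated FROM x.tbl ORDER BY id;
```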

1 Comment

A bit late to the party I know, but is it possible to execute the payload-function only once and last (as in only for the last UPDATE stmt)?
9

I don't know of a way to collapse trigger execution to once per (updated) row per transaction, but you can emulate this with a TEMPORARY ON COMMIT DROP table which tracks those modified rows and performs your expensive operation only once per row per tx:

CREATE OR REPLACE FUNCTION counter_change() RETURNS TRIGGER
AS $$
BEGIN
  -- If we're the first invocation of this trigger in this tx,
  -- make our scratch table.  Create unique index separately to
  -- avoid NOTICEs without fiddling with log_min_messages
  BEGIN
    CREATE LOCAL TEMPORARY TABLE tbl_counter_tx_once
      ("id" AS_APPROPRIATE NOT NULL)
      ON COMMIT DROP;
    CREATE UNIQUE INDEX ON tbl_counter_tx_once ("id");
  EXCEPTION WHEN duplicate_table THEN
    NULL;
  END;

  -- If we're the first invocation in this tx *for this row*,
  -- then do our expensive operation.
  BEGIN
    INSERT INTO tbl_counter_tx_once ("id") VALUES (NEW."id");
    PERFORM SOME_EXPENSIVE_OPERATION_HERE(NEW."id");
  EXCEPTION WHEN unique_violation THEN
    NULL;
  END;

  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

There's of course a risk of name collision with that temporary table, so choose judiciously.

6 Comments

Exception handling is expensive and not needed. Consider CREATE TABLE IF NOT EXISTS (new in 9.1) and IF NOT EXISTS (SELECT ..) THEN ...; INSERT INTO tbl ..; END IF;. Also LOCAL is just a noise word in PostgreSQL.
Re: non-exception based handling, yes, there's more than one way to do it. (Indeed my first tested solution was CREATE IF NOT EXISTS.) Re: LOCAL, yes, I know, and here I think it reinforces the purpose of using this table.
+1 Aside from that your solution deserves an upvote, too. Advanced stuff. You could include the current transaction ID with txid_current() in the name of the temporary table. Would force you to use dynamic SQL with EXECUTE, though. OR, better yet, add a column xid to the temp table with a stable name, then you can avoid the problem with static SQL!
This seems kind of heavy with the new tables, indexes and exception handling, currently I'm at 200 transactions a second and climbing. Also I need it to run after all the work in the transaction is complete, not after the first insert on that particular table. Good technique to know though, thanks.
Re: "seems kind of heavy," it may be, though I'd suggest profiling. Re: "I need it to run after ... the transaction," that's handled by the DEFERRABLEness of your trigger CONSTRAINT definition, not by the definition of the function to be executed.
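
For reference, the exception-free variant suggested in the comments above might look roughly like the sketch below. It assumes PostgreSQL 9.1 or later, an integer "id" column (the answer leaves the type open), and the some_expensive_procedure() function from the question. Note that CREATE TABLE IF NOT EXISTS raises a NOTICE when the table already exists, so this trades exception handling for NOTICE output.

```sql
CREATE OR REPLACE FUNCTION counter_change() RETURNS trigger
LANGUAGE plpgsql AS $$
BEGIN
  -- Scratch table for this transaction; dropped automatically at commit.
  -- "id" integer is an assumption, adjust to the real key type.
  CREATE TEMPORARY TABLE IF NOT EXISTS tbl_counter_tx_once
    ("id" integer PRIMARY KEY)
    ON COMMIT DROP;

  -- Run the expensive operation only on the first invocation for this row.
  IF NOT EXISTS (SELECT 1 FROM tbl_counter_tx_once t WHERE t."id" = NEW."id") THEN
    INSERT INTO tbl_counter_tx_once ("id") VALUES (NEW."id");
    PERFORM some_expensive_procedure(NEW."id");
  END IF;

  RETURN NULL;  -- return value of an AFTER trigger is ignored
END;
$$;
```

Whether this is actually cheaper than the exception-based version is worth profiling, as noted above.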
1

This cannot be done directly; you need a trick to do it.

For example, consider a balances(account_id, balance) table where no balance may be negative at the end of a transaction, but a balance can go negative during a transaction, e.g. due to partial updates to the table.

An ordinary CHECK (balance >= 0) constraint cannot be deferred, so it will not work. A deferred constraint trigger that checks new.balance >= 0 will not work either, because the value of new is fixed at the time the trigger is queued, not when it is executed.

Hence, a potential solution is to actually query the table in the trigger function:

create function check_balance_trigger()
returns trigger language plpgsql as $$
begin
    -- This queries the table at the time the trigger is executed:
    select * from balances into new where account_id = new.account_id;
    if new.balance < 0 then
        raise 'Balance cannot be negative: %, %', new.account_id, new.balance;
    end if;
    return new;
end $$;

create constraint trigger check_balance
after insert or update on balances deferrable initially deferred
for each row execute function check_balance_trigger();

1 Comment

This works, but it has the drawback that it runs the check once for each time a balance was changed during the transaction. If you expect the number of such changes per transaction to be small, and if the work required to validate the constraint is cheap, then maybe this is not a big deal, but suppose the work isn't cheap. Suppose you want to fire off a NOTIFY to inform listeners of the new balance. Then you really do want to run the constraint validation function only once for each balance that changed at any time during the transaction, regardless of how many times it changed.
0

pilcrow's answer is good, but what if you want to avoid the overhead of executing a PL/pgSQL function FOR EACH ROW that is touched? You can't have a CONSTRAINT trigger that is also a FOR EACH STATEMENT trigger. A solution is to push the deferred constraint trigger down by one level…

CREATE FUNCTION defer_once_trigger()
    RETURNS trigger
    LANGUAGE plpgsql
AS $$
BEGIN
    BEGIN
        CREATE TEMPORARY TABLE deferred_once_trigger (
                "id" integer NOT NULL PRIMARY KEY
            )
            ON COMMIT DROP;
        CREATE CONSTRAINT TRIGGER deferred_once_trigger
            AFTER INSERT ON pg_temp.deferred_once_trigger
            DEFERRABLE INITIALLY DEFERRED
            FOR EACH ROW
            EXECUTE FUNCTION deferred_once_trigger();
    EXCEPTION
        WHEN duplicate_table THEN
            NULL;
    END;
    CASE TG_OP
        WHEN 'INSERT' THEN
            INSERT INTO pg_temp.deferred_once_trigger
                SELECT DISTINCT "id"
                    FROM new
                ON CONFLICT ("id") DO NOTHING;
        WHEN 'UPDATE' THEN
            INSERT INTO pg_temp.deferred_once_trigger
                SELECT "id"
                    FROM old
                UNION
                SELECT "id"
                    FROM new
                ON CONFLICT ("id") DO NOTHING;
        WHEN 'DELETE' THEN
            INSERT INTO pg_temp.deferred_once_trigger
                SELECT DISTINCT "id"
                    FROM old
                ON CONFLICT ("id") DO NOTHING;
    END CASE;
    RETURN NULL;
END;
$$;

CREATE TRIGGER defer_once_trigger_insert
    AFTER INSERT ON my_table
    REFERENCING NEW TABLE AS new
    FOR EACH STATEMENT
    EXECUTE FUNCTION defer_once_trigger();

CREATE TRIGGER defer_once_trigger_update
    AFTER UPDATE ON my_table
    REFERENCING OLD TABLE AS old
        NEW TABLE AS new
    FOR EACH STATEMENT
    EXECUTE FUNCTION defer_once_trigger();

CREATE TRIGGER defer_once_trigger_delete
    AFTER DELETE ON my_table
    REFERENCING OLD TABLE AS old
    FOR EACH STATEMENT
    EXECUTE FUNCTION defer_once_trigger();

The defer_once_trigger() function is called only once per DML statement affecting my_table rather than once per affected row. That could translate to sizable performance gains if you have statements that affect many rows multiple times in the same transaction.

As in pilcrow's answer, the defer_once_trigger() function creates a temporary table to track the affected rows by primary key. If the table creation succeeds, then it also adds a deferrable constraint trigger to the temporary table. In any case, the function then inserts the IDs of all the affected rows into the temporary table, skipping the ones that are already present. The server automatically schedules deferred calls to the deferred_once_trigger() function for each distinct ID that is inserted into the temporary table.
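
The deferred_once_trigger() function itself is not shown above. A minimal sketch of what it could look like follows, assuming the payload is the some_expensive_procedure() function from the question. Here NEW is a row of the temporary table, so NEW."id" is the key of the affected row in my_table. This function has to exist before the first statement-level trigger fires, since the constraint trigger is created against it at that point.

```sql
CREATE FUNCTION deferred_once_trigger()
    RETURNS trigger
    LANGUAGE plpgsql
AS $$
BEGIN
    -- Fires at commit, once per distinct "id" inserted into
    -- pg_temp.deferred_once_trigger during the transaction.
    -- some_expensive_procedure() is an assumed payload, taken from the question.
    PERFORM some_expensive_procedure(NEW."id");
    RETURN NULL;  -- return value of an AFTER trigger is ignored
END;
$$;
```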

Note, because the temporary table is created in a connection-local temporary schema, there will never be any collisions with other concurrent transactions since each connection can have at most one open transaction at any given time. (The pg_temp schema name that is used to qualify the temporary table name is actually an alias that dynamically resolves to the server-assigned unique temporary schema name for the current connection.)

Although you can't use UPDATE OF … on a trigger that requests transition relations, you can perform equivalent filtering in the WHEN 'UPDATE' branch of CASE TG_OP. For example:

WHEN 'UPDATE' THEN
    INSERT INTO pg_temp.deferred_once_trigger
        SELECT DISTINCT "id"
            FROM old
                FULL JOIN new USING ("id")
            WHERE old.counter IS DISTINCT FROM new.counter
        ON CONFLICT ("id") DO NOTHING;
