
I'd like to have a SQL statement or function in Postgres 10 that allows me to insert a row only if there are less than a certain number of rows already.

select count(*) from mytable where mykey = 1
--- do the following only if the above is less than 5
insert into mytable (myvalue, mykey) values ('randomvalue', 1)
--- so there are never more than 5 rows in mytable with mykey = 1

Will something like this pseudocode work from the application (multiple round-trip calls to the Postgres server)?

tx = app.tx("start transaction");
count = tx.query("select count(*) from mytable where mykey = 1")
if (count < 5) {
  tx.query("insert into mytable (myvalue, mykey) values ('randomvalue', 1)")
}
tx.do("commit")

If it won't work, how could I do this on the Postgres side, so I only make one call from the application, like select myinsertfunction('randomvalue', 1)?

Whether it's done with multiple round trips as above or with a single call to Postgres, the most important requirement is that concurrent execution can never result in more than 5 rows. For example, if transactions 1 and 2 both check the count, both see 4 (less than the maximum of 5), and both proceed, then each inserts one row, leaving 6 rows instead of the maximum of 5.


1 Answer


This problem is known as Phantom Read:

A transaction re-executes a query returning a set of rows that satisfy a search condition and finds that the set of rows satisfying the condition has changed due to another recently-committed transaction.

Try

BEGIN;
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
INSERT INTO mytable (myvalue, mykey) SELECT 'randomvalue', 1 WHERE
    (SELECT COUNT(*) FROM mytable WHERE mykey = 1) < 5;
END;

The SERIALIZABLE isolation level makes concurrent transactions behave as if they ran one after another, so a row is inserted only if the count is genuinely less than 5. Note that when two such transactions do conflict, one of them will be aborted with a serialization failure (SQLSTATE 40001), and the application must retry it.
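For the single-call variant the question asks about, here is one possible sketch of such a function (`myinsertfunction` is the hypothetical name from the question; the signature and return value are assumptions). Instead of SERIALIZABLE isolation, it serializes inserts per key with a transaction-level advisory lock, which avoids the need for a retry loop:

```sql
-- Sketch, assuming mytable(myvalue text, mykey integer).
-- pg_advisory_xact_lock serializes callers per key; the lock is
-- released automatically when the calling transaction ends.
CREATE FUNCTION myinsertfunction(p_value text, p_key integer)
RETURNS boolean AS $$
BEGIN
    -- Block until no other transaction holds the lock for this key.
    PERFORM pg_advisory_xact_lock(p_key);

    -- Safe to count now: no concurrent caller can insert for this key
    -- until we commit or roll back.
    IF (SELECT count(*) FROM mytable WHERE mykey = p_key) < 5 THEN
        INSERT INTO mytable (myvalue, mykey) VALUES (p_value, p_key);
        RETURN true;   -- row inserted
    END IF;

    RETURN false;      -- cap reached, nothing inserted
END;
$$ LANGUAGE plpgsql;

-- Usage, as in the question:
-- SELECT myinsertfunction('randomvalue', 1);
```

One caveat with this approach: the advisory lock key space is shared database-wide, so if other code also uses `pg_advisory_xact_lock` with plain integer keys, the two-argument form can be used to namespace the lock.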


6 Comments

Wouldn't setting the transaction level to repeatable read work too? Could you describe a scenario where it wouldn't work but serializable would?
No, that wouldn't work in general, because REPEATABLE READ (or lower) doesn't protect the whole table from modifications. And because you use COUNT to access the entire table, you must prevent other processes from changing it. But this doesn't matter: "When you select the level Read Uncommitted you really get Read Committed, and when you select Repeatable Read you really get Serializable" (see the linked docs above).
What are the costs of using serializable in this case (as opposed to the default read committed)? For example, would all other queries that read or write from mytable block until this transaction commits or rolls back?
No, not necessarily. Transaction serialization is very hard to explain in a comment, but there are different sources you could read, e.g.: Wikipedia or Postgres.
While this answer works, can someone explain to me why the OP's pseudo code doesn't work? In other words, why is checking the count in the application layer not OK, even if we use serializable transactions? I'm asking because I ran into this exact issue, where I tried checking the count in the application layer and it didn't work.
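To illustrate the race the question describes, this is the interleaving that can happen when the count check runs at the default READ COMMITTED level (a hypothetical two-session transcript, not output from a real run):

```sql
-- Session A                             -- Session B
BEGIN;                                   BEGIN;
SELECT count(*) FROM mytable
  WHERE mykey = 1;   -- sees 4
                                         SELECT count(*) FROM mytable
                                           WHERE mykey = 1;   -- also sees 4
-- 4 < 5, so insert                      -- 4 < 5, so insert
INSERT INTO mytable (myvalue, mykey)     INSERT INTO mytable (myvalue, mykey)
  VALUES ('a', 1);                         VALUES ('b', 1);
COMMIT;                                  COMMIT;
-- mytable now has 6 rows with mykey = 1, violating the cap
```

Neither session sees the other's uncommitted row, so both checks pass. Whether the check runs in the application or in SQL makes no difference; what matters is the isolation level (plus retrying serialization failures) or some explicit lock.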