
I have a table named 'games', which contains a column named 'title'. This column is unique. The database used is PostgreSQL.

I have a user input form that allows the user to insert a new 'game' into the 'games' table. The function that inserts a new game checks whether a previously entered 'game' with the same 'title' already exists; for this, I get the count of rows with the same game 'title'.

I use a transaction for this: the insert function starts with BEGIN, gets the row count, inserts the new row if the count is 0, and COMMITs the changes once the process is completed.
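Roughly, the flow is something like this (columns simplified; the title value comes from the form):

BEGIN;
SELECT count(id) FROM games WHERE title = 'new_game_title';
-- if the application sees a count of 0, it then runs:
INSERT INTO games (title) VALUES ('new_game_title');
COMMIT;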

The problem is that if two games with the same title are submitted at the same time, both could be inserted, since I only get the count of rows to check for duplicate records and each transaction is isolated from the other.

I thought of locking the table when getting the row count, like so:

LOCK TABLE games IN ACCESS EXCLUSIVE MODE;
SELECT count(id) FROM games WHERE games.title = 'new_game_title';

This would lock the table for reading too, which means the other transaction would have to wait until the current one has completed. I suspect this would solve the problem. Is there a better way around this (avoiding duplicate games with the same title)?

  • Try changing the isolation levels for your transaction. Commented Jan 6, 2013 at 5:54
  • Why don't you use a unique constraint instead of trying to fight the race conditions yourself? Commented Jan 6, 2013 at 5:56
  • @muistooshort I could do that, but it would produce an error at the user end Commented Jan 6, 2013 at 5:57
  • Then trap the error yourself. You're trying to avoid a simple bit of error handling using a fragile pile of kludges; save yourself some trouble and let the database manage the data and its constraints. Commented Jan 6, 2013 at 6:15
  • You have to trap errors anyway. There are a lot of things besides a constraint violation that can make an INSERT fail: memory error, connectivity problem, permissions, etc. Trap this one, too. Commented Jan 6, 2013 at 7:39

2 Answers

5

You should NOT need to lock your tables in this situation.

Instead, you can use one of the following approaches:

  • Define a UNIQUE index on the column that really must be unique. In this case, the first transaction will succeed and the second will error out.
  • Define an AFTER INSERT OR UPDATE OR DELETE trigger that checks your condition and, if it does not hold, RAISEs an error, which aborts the offending transaction.

In either case, your client code should be ready to properly handle possible failures (such as a failed transaction) returned when executing your statements.
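As a rough sketch of the first approach (the constraint name and the idea of wrapping the INSERT in a PL/pgSQL function are my assumptions, not something from the question):

-- Enforce uniqueness at the database level on the games.title column.
ALTER TABLE games ADD CONSTRAINT games_title_key UNIQUE (title);

-- A duplicate INSERT now fails with SQLSTATE 23505 (unique_violation).
-- Wrapping the INSERT in a function lets you trap that error instead of
-- counting rows first:
CREATE OR REPLACE FUNCTION add_game(p_title text) RETURNS boolean AS $$
BEGIN
    INSERT INTO games (title) VALUES (p_title);
    RETURN true;   -- inserted
EXCEPTION WHEN unique_violation THEN
    RETURN false;  -- a game with this title already exists
END;
$$ LANGUAGE plpgsql;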


8 Comments

+1, a unique constraint is the only sensible way to go. (I'm not really fond of the trigger solution though).
How about using something like SELECT count(id) FROM games WHERE games.title = 'new_game_title' FOR UPDATE?
@Akash: You can't count, or lock FOR UPDATE, rows that have not yet been committed by the other process. A unique constraint is the only safe solution, as mentioned by the others. A unique constraint is made for exactly this problem, so use it.
@Akash: don't use explicit locking. Use a unique constraint and catch the error. That way your application will be much more scalable and will use far fewer resources on the DB server. As others have pointed out, you have to implement error handling anyway.
"I plan to use FOR UPDATE along with UNIQUE constraint" Why? It makes your application slow and the FOR UPDATE doesn't add anything at all. "its next to impossible catching the right error" Why? Just read (catch) the error message and you're done. Very simple to implement.
3

Using the highest transaction isolation level (Serializable), you can achieve something close to what you are actually asking for. But be aware that this may fail with ERROR: could not serialize access due to concurrent update.
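For reference, a minimal sketch of that approach (the statements are assumed from the question; the application must be prepared to retry on a serialization failure):

BEGIN ISOLATION LEVEL SERIALIZABLE;
SELECT count(id) FROM games WHERE title = 'new_game_title';
-- if the count is 0:
INSERT INTO games (title) VALUES ('new_game_title');
COMMIT;  -- one of two concurrent transactions may abort with a serialization error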

I do not entirely agree with the constraint approach. You should have a constraint to protect data integrity, but relying on the constraint alone forces you to identify not only that an error occurred, but which constraint caused it. The trouble is not catching the error, as some have discussed, but working out what caused it and giving the user a human-readable reason for the failure. Depending on which language your application is written in, this can be next to impossible, e.g. telling the user "Game title [foo] already exists" rather than "game must have a price" for a separate constraint.

There is a single statement alternative to your two stage approach:

INSERT INTO games ( title, [column2], ... )
SELECT 'new_game_title', [value2], ...
WHERE NOT EXISTS ( SELECT 1 FROM games AS g2 WHERE g2.title = 'new_game_title' );

I want to be clear with this... this is not an alternative to having a unique constraint (which requires extra data for the index). You must have one to protect your data from corruption.

5 Comments

I think this is erroneous; it would allow multiple inserts of a single game title due to a race condition, because the INSERT happens only when the inner SELECT finds zero rows, and therefore no lock is taken.
I understand that having a unique constraint will prevent multiple same-title inserts altogether, but the INSERT INTO..SELECT WHERE NOT EXISTS solution put forward has a race condition; it's possible that it will insert a duplicate title and the caller will be none the wiser, so I fail to see the value of it. You say 'it will reliably fail to insert the row regardless of constraints', but that is not true, as two concurrent inserts of the same title may both succeed.
It's possible I'd been doing far too much Oracle coding when I originally wrote this. But I note that descriptions of "Read Committed Isolation Level" changed in the manual between 8.3 and 8.4. Without digging out an archaic version of postgres I couldn't tell if the advice was always wrong or just out of date. I've modified my answer to correct.
I'm still not sure it's correct, but you may well know more than me on the subject. I find these things very hard to reason about just by looking at the SQL rather than, as you rightly point out, referring to the docs, but my understanding is that because the inner select doesn't in fact select anything, no lock is taken regardless of the isolation level, and therefore this is not safe in a concurrent scenario. I am 99% certain that is the case on PG 9.6.3, as I have been working on exactly this today on that version.
There's no solution which will never error and just wait. I'm a bit disappointed with Postgres for this. Other DBMSs (Oracle and MySQL) will execute the above SQL atomically, forcing other queries to wait. Those DBMSs run the risk of deadlock, so I guess it's a case of picking your poison.
