@@ -60,20 +60,29 @@ identical lock mode sets. See src/tools/backend/index.html and
 src/include/storage/lock.h for more details. (Lock modes are also called
 lock types in some places in the code and documentation.)
 
-There are two fundamental lock structures in shared memory: the
-per-lockable-object LOCK struct, and the per-lock-and-requestor PROCLOCK
-struct. A LOCK object exists for each lockable object that currently has
-locks held or requested on it. A PROCLOCK struct exists for each backend
-that is holding or requesting lock(s) on each LOCK object.
-
-In addition to these, each backend maintains an unshared LOCALLOCK structure
-for each lockable object and lock mode that it is currently holding or
-requesting. The shared lock structures only allow a single lock grant to
-be made per lockable object/lock mode/backend. Internally to a backend,
-however, the same lock may be requested and perhaps released multiple times
-in a transaction, and it can also be held both transactionally and session-
-wide. The internal request counts are held in LOCALLOCK so that the shared
-data structures need not be accessed to alter them.
+There are two main methods for recording locks in shared memory. The primary
+mechanism uses two structures: the per-lockable-object LOCK struct, and the
+per-lock-and-requestor PROCLOCK struct. A LOCK object exists for each
+lockable object that currently has locks held or requested on it. A PROCLOCK
+struct exists for each backend that is holding or requesting lock(s) on each
+LOCK object.
+
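+A simplified sketch of how these two structures relate (illustrative only;
+the real definitions in src/include/storage/lock.h carry many more fields):
+
+    #include <stdint.h>
+
+    typedef struct LOCKTAG { uint32_t f1, f2, f3, f4; } LOCKTAG;
+    typedef int LOCKMASK;           /* one bit per lock mode */
+    typedef struct PGPROC PGPROC;   /* per-backend shared state */
+
+    /* One LOCK per lockable object with locks held or requested. */
+    typedef struct LOCK
+    {
+        LOCKTAG   tag;        /* hash key: identifies the lockable object */
+        LOCKMASK  grantMask;  /* bit set for each mode currently granted */
+        LOCKMASK  waitMask;   /* bit set for each mode being awaited */
+        /* ... plus lists of PROCLOCKs and waiting backends ... */
+    } LOCK;
+
+    /* One PROCLOCK per (backend, LOCK) pair. */
+    typedef struct PROCLOCK
+    {
+        LOCK     *lock;       /* the lockable object */
+        PGPROC   *proc;       /* backend holding or requesting it */
+        LOCKMASK  holdMask;   /* modes this backend has been granted */
+    } PROCLOCK;
+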
+There is also a special "fast path" mechanism which backends may use to
+record a limited number of locks with very specific characteristics: they must
+use the DEFAULT lockmethod; they must represent a lock on a database relation
+(not a shared relation); they must be a "weak" lock which is unlikely to
+conflict (AccessShareLock, RowShareLock, or RowExclusiveLock); and the system
+must be able to quickly verify that no conflicting locks can possibly be
+present. See "Fast Path Locking", below, for more details.
+
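+As a rough illustration, the static part of that test amounts to the
+following (hypothetical helper, not PostgreSQL's actual code; the fourth
+condition -- absence of conflicting locks -- is checked at acquisition time,
+as described under "Fast Path Locking" below):
+
+    #include <stdbool.h>
+
+    typedef enum                  /* lock modes, in increasing strength */
+    {
+        NoLock = 0,
+        AccessShareLock,          /* weak: plain SELECT */
+        RowShareLock,             /* weak: SELECT FOR UPDATE/FOR SHARE */
+        RowExclusiveLock,         /* weak: INSERT, UPDATE, DELETE */
+        ShareUpdateExclusiveLock,
+        ShareLock,
+        ShareRowExclusiveLock,
+        ExclusiveLock,
+        AccessExclusiveLock
+    } LockMode;
+
+    static bool
+    eligible_for_fast_path(bool default_lockmethod, bool unshared_relation,
+                           LockMode mode)
+    {
+        return default_lockmethod &&
+               unshared_relation &&
+               mode >= AccessShareLock &&
+               mode <= RowExclusiveLock;   /* only the three weak modes */
+    }
+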
+Each backend also maintains an unshared LOCALLOCK structure for each lockable
+object and lock mode that it is currently holding or requesting. The shared
+lock structures only allow a single lock grant to be made per lockable
+object/lock mode/backend. Internally to a backend, however, the same lock may
+be requested and perhaps released multiple times in a transaction, and it can
+also be held both transactionally and session-wide. The internal request
+counts are held in LOCALLOCK so that the shared data structures need not be
+accessed to alter them.
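+
+A simplified sketch of this unshared bookkeeping (illustrative; the real
+LOCALLOCK in src/include/storage/lock.h also caches pointers into the
+shared tables and tracks which resource owners hold the lock):
+
+    #include <stdint.h>
+
+    typedef struct LOCALLOCKTAG
+    {
+        uint32_t  object_id;      /* stand-in for the full LOCKTAG */
+        int       mode;           /* lock mode */
+    } LOCALLOCKTAG;
+
+    typedef struct LOCALLOCK
+    {
+        LOCALLOCKTAG tag;         /* hash key: object + mode */
+        int64_t      nLocks;      /* times held locally; only the 0 -> 1
+                                   * and 1 -> 0 transitions need to touch
+                                   * the shared structures */
+    } LOCALLOCK;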
 
 ---------------------------------------------------------------------------
 
@@ -250,6 +259,65 @@ tradeoff: we could instead recalculate the partition number from the LOCKTAG
 when needed.
 
 
+Fast Path Locking
+-----------------
+
+Fast path locking is a special-purpose mechanism designed to reduce the
+overhead of taking and releasing weak relation locks. SELECT, INSERT,
+UPDATE, and DELETE must acquire a lock on every relation they operate on,
+as well as various system catalogs that can be used internally. These locks
+are notable not only for the very high frequency with which they are taken
+and released, but also for the fact that they virtually never conflict.
+Many DML operations can proceed in parallel against the same table at the
+same time; only DDL operations such as CLUSTER, ALTER TABLE, or DROP -- or
+explicit user action such as LOCK TABLE -- will create lock conflicts with
+the "weak" locks (AccessShareLock, RowShareLock, RowExclusiveLock) acquired
+by DML operations.
+
+The primary locking mechanism does not cope well with this workload. Even
+though the lock manager locks are partitioned, the locktag for any given
+relation still falls in one, and only one, partition. Thus, if many short
+queries are accessing the same relation, the lock manager partition lock for
+that partition becomes a contention bottleneck. This effect is measurable
+even on 2-core servers, and becomes very pronounced as core count increases.
+
+To alleviate this bottleneck, beginning in PostgreSQL 9.2, each backend is
+permitted to record a limited number of locks on unshared relations in an
+array within its PGPROC structure, rather than using the primary lock table.
+This is called the "fast path" mechanism, and can only be used when the
+locker can verify that no conflicting locks can possibly exist.
+
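+For illustration, the per-backend fast-path area can be pictured roughly
+like this (sizes and names here are hypothetical, not the actual PGPROC
+fields):
+
+    #include <stdint.h>
+
+    #define FP_LOCK_SLOTS 16            /* "a limited number of locks" */
+
+    typedef struct FastPathArea         /* one per backend, in PGPROC */
+    {
+        uint64_t lockBits;              /* a few bits per slot recording
+                                         * which weak modes are held */
+        uint32_t relid[FP_LOCK_SLOTS];  /* OID of locked relation, or 0 */
+    } FastPathArea;
+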
+A key point of this algorithm is that it must be possible to verify the
+absence of possibly conflicting locks without fighting over a shared LWLock or
+spinlock. Otherwise, this effort would simply move the contention bottleneck
+from one place to another. We accomplish this using an array of 1024 integer
+counters, which are in effect a 1024-way partitioning of the lock space. Each
+counter records the number of "strong" locks (that is, ShareLock,
+ShareRowExclusiveLock, ExclusiveLock, and AccessExclusiveLock) on unshared
+relations that fall into that partition. When this counter is non-zero, the
+fast path mechanism may not be used for relation locks in that partition. A
+strong locker bumps the counter and then scans each per-backend array for
+matching fast-path locks; any which are found must be transferred to the
+primary lock table before attempting to acquire the lock, to ensure proper
+lock conflict and deadlock detection.
+
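+In outline, the counter array can be declared as below (names illustrative;
+the next paragraph calls the real structure FastPathStrongLocks):
+
+    #define FP_STRONG_PARTITIONS 1024   /* 1024-way lock-space partition */
+
+    /* Count of strong relation locks held or pending in each partition. */
+    static int strong_lock_counts[FP_STRONG_PARTITIONS];
+
+    /* Hypothetical mapping from a locktag's hash to its partition. */
+    #define StrongLockPartition(hashcode) \
+        ((hashcode) % FP_STRONG_PARTITIONS)
+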
+On an SMP system, we must guarantee proper memory synchronization. Here we
+rely on the fact that LWLock acquisition acts as a memory sequence point: if
+A performs a store, A and B both acquire an LWLock in either order, and B
+then performs a load on the same memory location, it is guaranteed to see
+A's store. In this case, each backend's fast-path lock queue is protected
+by an LWLock. A backend wishing to acquire a fast-path lock grabs this
+LWLock before examining FastPathStrongLocks to check for the presence of a
+conflicting strong lock. And the backend attempting to acquire a strong
+lock, because it must transfer any matching weak locks taken via the fast-path
+mechanism to the shared lock table, will acquire every LWLock protecting
+a backend fast-path queue in turn. Thus, if we examine FastPathStrongLocks
+and see a zero, then either the value is truly zero, or if it is a stale value,
+the strong locker has yet to acquire the per-backend LWLock we now hold (or,
+indeed, even the first per-backend LWLock) and will notice any weak lock we
+take when it does.
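+
+A minimal compilable sketch of the two sides of this protocol (all names
+are illustrative stand-ins, not PostgreSQL's actual API):
+
+    #include <stdbool.h>
+
+    typedef struct LWLock LWLock;
+    extern void LWLockAcquire(LWLock *lock);   /* memory sequence point */
+    extern void LWLockRelease(LWLock *lock);
+
+    extern int     strong_lock_counts[];  /* i.e. FastPathStrongLocks */
+    extern LWLock *backend_locks[];       /* protects each fast-path queue */
+    extern int     n_backends;
+
+    /* Weak locker: take the fast path only if, while holding our own
+     * per-backend LWLock, we observe a zero strong-lock count. */
+    static bool
+    try_fast_path(int me, int partition, bool (*record)(int backend))
+    {
+        bool ok = false;
+
+        LWLockAcquire(backend_locks[me]);
+        if (strong_lock_counts[partition] == 0)
+            ok = record(me);   /* any later strong locker must acquire
+                                * backend_locks[me] and so will see this */
+        LWLockRelease(backend_locks[me]);
+        return ok;
+    }
+
+    /* Strong locker: bump the counter first, then sweep every backend's
+     * queue, transferring matching fast-path locks to the primary table. */
+    static void
+    prepare_strong_lock(int partition, void (*transfer)(int backend))
+    {
+        strong_lock_counts[partition]++;
+        for (int i = 0; i < n_backends; i++)
+        {
+            LWLockAcquire(backend_locks[i]);
+            transfer(i);
+            LWLockRelease(backend_locks[i]);
+        }
+        /* ...then acquire the strong lock via the primary mechanism... */
+    }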
+
+
 The Deadlock Detection Algorithm
 --------------------------------
 