@@ -597,21 +597,22 @@ deadlock detection algorithm very much, but it makes the bookkeeping more
  complicated.

  We choose to regard locks held by processes in the same parallel group as
- non-conflicting. This means that two processes in a parallel group can hold a
- self-exclusive lock on the same relation at the same time, or one process can
- acquire an AccessShareLock while the other already holds AccessExclusiveLock.
- This might seem dangerous and could be in some cases (more on that below), but
- if we didn't do this then parallel query would be extremely prone to
- self-deadlock. For example, a parallel query against a relation on which the
- leader already had AccessExclusiveLock would hang, because the workers would
- try to lock the same relation and be blocked by the leader; yet the leader
- can't finish until it receives completion indications from all workers. An
- undetected deadlock results. This is far from the only scenario where such a
- problem happens. The same thing will occur if the leader holds only
- AccessShareLock, the worker seeks AccessShareLock, but between the time the
- leader attempts to acquire the lock and the time the worker attempts to
- acquire it, some other process queues up waiting for an AccessExclusiveLock.
- In this case, too, an indefinite hang results.
+ non-conflicting with the exception of relation extension and page locks. This
+ means that two processes in a parallel group can hold a self-exclusive lock on
+ the same relation at the same time, or one process can acquire an AccessShareLock
+ while the other already holds AccessExclusiveLock. This might seem dangerous and
+ could be in some cases (more on that below), but if we didn't do this then
+ parallel query would be extremely prone to self-deadlock. For example, a
+ parallel query against a relation on which the leader already had
+ AccessExclusiveLock would hang, because the workers would try to lock the same
+ relation and be blocked by the leader; yet the leader can't finish until it
+ receives completion indications from all workers. An undetected deadlock
+ results. This is far from the only scenario where such a problem happens. The
+ same thing will occur if the leader holds only AccessShareLock, the worker
+ seeks AccessShareLock, but between the time the leader attempts to acquire the
+ lock and the time the worker attempts to acquire it, some other process queues
+ up waiting for an AccessExclusiveLock. In this case, too, an indefinite hang
+ results.

  It might seem that we could predict which locks the workers will attempt to
  acquire and ensure before going parallel that those locks would be acquired
@@ -637,18 +638,23 @@ the other is safe enough. Problems would occur if the leader initiated
  parallelism from a point in the code at which it had some backend-private
  state that made table access from another process unsafe, for example after
  calling SetReindexProcessing and before calling ResetReindexProcessing,
- catastrophe could ensue, because the worker won't have that state. Similarly,
- problems could occur with certain kinds of non-relation locks, such as
- relation extension locks. It's no safer for two related processes to extend
- the same relation at the time than for unrelated processes to do the same.
- However, since parallel mode is strictly read-only at present, neither this
- nor most of the similar cases can arise at present. To allow parallel writes,
- we'll either need to (1) further enhance the deadlock detector to handle those
- types of locks in a different way than other types; or (2) have parallel
- workers use some other mutual exclusion method for such cases; or (3) revise
- those cases so that they no longer use heavyweight locking in the first place
- (which is not a crazy idea, given that such lock acquisitions are not expected
- to deadlock and that heavyweight lock acquisition is fairly slow anyway).
+ catastrophe could ensue, because the worker won't have that state.
+
+ To allow parallel inserts and parallel copy, we have ensured that relation
+ extension and page locks don't participate in group locking, which means such
+ locks can conflict among members of the same group. This is required because
+ it is no safer for two related processes to extend the same relation or
+ perform cleanup in GIN indexes at the same time than for unrelated processes
+ to do so. We don't acquire a heavyweight lock on any other object after a
+ relation extension lock, so such a lock can never participate in a deadlock
+ cycle. After acquiring a page lock, we can acquire a relation extension lock,
+ but the reverse never happens, so page locks cannot participate in a deadlock
+ cycle either. To allow other parallel writes such as parallel update or
+ parallel delete, we'll need to either (1) further enhance the deadlock
+ detector to handle tuple locks differently from other lock types; or (2) have
+ parallel workers use some other mutual exclusion method for such cases.
+ Currently, parallel mode is strictly read-only, but we now have the
+ infrastructure to allow parallel inserts and parallel copy.

  Group locking adds three new members to each PGPROC: lockGroupLeader,
  lockGroupMembers, and lockGroupLink. A PGPROC's lockGroupLeader is NULL for
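The rule this change describes (locks held within one parallel group do not conflict, except relation extension and page locks) can be illustrated with a small, self-contained C sketch. It is not the actual lock manager code. Every name in it (SketchProc, SketchTagType, SketchMode, sketch_locks_conflict and the helpers) is hypothetical, the two-mode conflict table is a toy stand-in for the real lock modes, and it assumes for illustration that a group leader's lockGroupLeader points to itself while each worker's points to the leader.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not PostgreSQL's real definitions. */
typedef enum
{
    TAG_RELATION,           /* ordinary relation lock */
    TAG_RELATION_EXTEND,    /* relation extension lock */
    TAG_PAGE                /* page lock */
} SketchTagType;

typedef enum
{
    MODE_SHARE,
    MODE_EXCLUSIVE
} SketchMode;

typedef struct SketchProc
{
    /* Assumed: NULL outside any group, else points at the group leader. */
    struct SketchProc *lockGroupLeader;
} SketchProc;

/* Toy conflict table: only SHARE with SHARE is compatible. */
static bool
modes_conflict(SketchMode held, SketchMode wanted)
{
    return held == MODE_EXCLUSIVE || wanted == MODE_EXCLUSIVE;
}

/* True when both processes belong to the same lock group. */
static bool
same_lock_group(const SketchProc *a, const SketchProc *b)
{
    return a->lockGroupLeader != NULL &&
           a->lockGroupLeader == b->lockGroupLeader;
}

/* Relation extension and page locks are excluded from group locking. */
static bool
tag_uses_group_locking(SketchTagType tag)
{
    return tag != TAG_RELATION_EXTEND && tag != TAG_PAGE;
}

/* Would 'requester' be blocked by the lock that 'holder' already holds? */
static bool
sketch_locks_conflict(SketchTagType tag,
                      const SketchProc *holder, SketchMode held,
                      const SketchProc *requester, SketchMode wanted)
{
    assert(holder != NULL && requester != NULL);

    if (!modes_conflict(held, wanted))
        return false;       /* compatible modes never conflict */

    if (tag_uses_group_locking(tag) && same_lock_group(holder, requester))
        return false;       /* same group: treated as non-conflicting */

    return true;            /* different groups, or an excluded lock type */
}
```

Under these assumptions, a worker requesting a share lock on a relation that its leader holds exclusively is not blocked, while two members of the same group requesting the relation extension lock on the same relation still exclude each other, just as unrelated processes would.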