@@ -575,16 +575,21 @@ while holding AccessExclusiveLock on the relation.
 
 Due to all these constraints, complex changes (such as a multilevel index
 insertion) normally need to be described by a series of atomic-action WAL
-records. What do you do if the intermediate states are not self-consistent?
-The answer is that the WAL replay logic has to be able to fix things up.
-In btree indexes, for example, a page split requires insertion of a new key in
-the parent btree level, but for locking reasons this has to be reflected by
-two separate WAL records. The replay code has to remember "unfinished" split
-operations, and match them up to subsequent insertions in the parent level.
-If no matching insert has been found by the time the WAL replay ends, the
-replay code has to do the insertion on its own to restore the index to
-consistency. Such insertions occur after WAL is operational, so they can
-and should write WAL records for the additional generated actions.
+records. The intermediate states must be self-consistent, so that if the
+replay is interrupted between any two actions, the system is fully
+functional. In btree indexes, for example, a page split requires a new page
+to be allocated, and an insertion of a new key in the parent btree level,
+but for locking reasons this has to be reflected by two separate WAL
+records. Replaying the first record, to allocate the new page and move
+tuples to it, sets a flag on the page to indicate that the key has not been
+inserted to the parent yet. Replaying the second record clears the flag.
+This intermediate state is never seen by other backends during normal
+operation, because the lock on the child page is held across the two
+actions, but will be seen if the operation is interrupted before writing
+the second WAL record. The search algorithm works with the intermediate
+state as normal, but if an insertion encounters a page with the
+incomplete-split flag set, it will finish the interrupted split by
+inserting the key to the parent, before proceeding.
 
 Writing Hints
 -------------
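
For illustration only, the incomplete-split protocol described in the added
paragraph can be modeled with a toy two-level tree in Python. Everything here
is hypothetical and is not PostgreSQL code: the names Page, Parent,
split_first_action, split_second_action, find_leaf, search and insert, the
page capacity of 4, and the interrupt_split parameter that simulates a crash
between the two WAL records. The sketch also assumes that the flag lives on
the original (left) page and that a search can follow a right-sibling link;
the paragraph itself does not spell out either detail.

```python
from bisect import bisect_right, insort
from dataclasses import dataclass, field
from typing import List, Optional

CAPACITY = 4  # toy page capacity, chosen only for this example


@dataclass
class Page:
    keys: List[int] = field(default_factory=list)
    right: Optional["Page"] = None   # right sibling, if any
    incomplete_split: bool = False   # True until the parent has the new key


@dataclass
class Parent:
    seps: List[int]        # seps[i] is the lowest key stored in children[i + 1]
    children: List[Page]


def split_first_action(page: Page) -> Page:
    """First atomic action: allocate a page, move tuples, set the flag."""
    mid = len(page.keys) // 2
    new_page = Page(keys=page.keys[mid:], right=page.right)
    page.keys = page.keys[:mid]
    page.right = new_page
    page.incomplete_split = True
    return new_page


def split_second_action(parent: Parent, page: Page) -> None:
    """Second atomic action: insert the key into the parent, clear the flag."""
    new_page = page.right
    sep = new_page.keys[0]
    idx = bisect_right(parent.seps, sep)
    parent.seps.insert(idx, sep)
    parent.children.insert(idx + 1, new_page)
    page.incomplete_split = False


def find_leaf(parent: Parent, key: int) -> Page:
    """Descend from the parent, then step right past any missing downlink."""
    page = parent.children[bisect_right(parent.seps, key)]
    while page.right is not None and key >= page.right.keys[0]:
        page = page.right
    return page


def search(parent: Parent, key: int) -> bool:
    # Works unchanged whether or not a split is incomplete.
    return key in find_leaf(parent, key).keys


def insert(parent: Parent, key: int, interrupt_split: bool = False) -> None:
    page = parent.children[bisect_right(parent.seps, key)]
    if page.incomplete_split:
        # Finish the interrupted split before proceeding, then descend again
        # because the parent now has one more child.
        split_second_action(parent, page)
        page = parent.children[bisect_right(parent.seps, key)]
    insort(page.keys, key)
    if len(page.keys) > CAPACITY:
        split_first_action(page)
        if interrupt_split:
            return  # simulated crash before the second WAL record
        split_second_action(parent, page)
```

In this model, calling insert with interrupt_split=True leaves the flag set on
the left page and the parent without the new key. search still reaches the
moved keys through the right-sibling link, and the next insert that descends
to the flagged page completes the split first.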