@@ -260,10 +260,23 @@ heapgetpage(HeapScanDesc scan, BlockNumber page)
 
 /*
  * If the all-visible flag indicates that all tuples on the page are
- * visible to everyone, we can skip the per-tuple visibility tests. But
- * not in hot standby mode. A tuple that's already visible to all
+ * visible to everyone, we can skip the per-tuple visibility tests.
+ *
+ * Note: In hot standby, a tuple that's already visible to all
  * transactions in the master might still be invisible to a read-only
- * transaction in the standby.
+ * transaction in the standby. We partly handle this problem by tracking
+ * the minimum xmin of visible tuples as the cut-off XID while marking a
+ * page all-visible on master and WAL-logging that along with the visibility
+ * map SET operation. In hot standby, we wait for (or abort) all
+ * transactions that might not see one or more tuples on the
+ * page. That's how index-only scans work fine in hot standby. A crucial
+ * difference between index-only scans and heap scans is that the
+ * index-only scan completely relies on the visibility map whereas a heap
+ * scan looks at the page-level PD_ALL_VISIBLE flag. We are not sure if the
+ * page-level flag can be trusted in the same way, because it might get
+ * propagated somehow without being explicitly WAL-logged, e.g. via a full
+ * page write. Until we can prove that beyond doubt, let's check each
+ * tuple for visibility the hard way.
  */
 all_visible = PageIsAllVisible(dp) && !snapshot->takenDuringRecovery;
 
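The effect of the retained `!snapshot->takenDuringRecovery` test can be illustrated with a small standalone sketch. This is not PostgreSQL code: `SketchPage`, `SketchSnapshot`, and both functions are hypothetical stand-ins, and the per-tuple rule (a tuple is visible when its xmin is below the snapshot's xmax) is a deliberately simplified toy that ignores XID wraparound, commit status, and xmax of the tuple.

```c
#include <stdbool.h>

#define SKETCH_MAX_TUPLES 8

/* Hypothetical stand-in for a heap page; not the real PageHeaderData. */
typedef struct SketchPage
{
    bool     pd_all_visible;                 /* models PD_ALL_VISIBLE */
    int      ntuples;
    unsigned tuple_xmin[SKETCH_MAX_TUPLES];  /* inserting XID per tuple */
} SketchPage;

/* Hypothetical stand-in for a snapshot; not the real SnapshotData. */
typedef struct SketchSnapshot
{
    bool     takenDuringRecovery;  /* true for a hot standby snapshot */
    unsigned xmax;                 /* toy cut-off: XIDs >= xmax are invisible */
} SketchSnapshot;

/* Mirrors the condition in the diff: trust the page-level flag only
 * when the snapshot was not taken during recovery. */
static bool
sketch_can_skip_visibility_checks(const SketchPage *page,
                                  const SketchSnapshot *snap)
{
    return page->pd_all_visible && !snap->takenDuringRecovery;
}

/* Count the tuples the snapshot can see, taking the shortcut when allowed
 * and otherwise applying the toy per-tuple rule. */
static int
sketch_count_visible(const SketchPage *page, const SketchSnapshot *snap)
{
    bool all_visible = sketch_can_skip_visibility_checks(page, snap);
    int  visible = 0;

    for (int i = 0; i < page->ntuples; i++)
    {
        if (all_visible || page->tuple_xmin[i] < snap->xmax)
            visible++;
    }
    return visible;
}
```

Under these assumptions, a page flagged all-visible whose tuples have xmin 5 and 20 yields two visible tuples for a primary snapshot, but only one for a standby snapshot with a toy xmax of 10, because the standby falls back to checking each tuple.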