Commit faeedbc
committed
Introduce PG_IO_ALIGN_SIZE and align all I/O buffers.
In order to have the option to use O_DIRECT/FILE_FLAG_NO_BUFFERING in a
later commit, we need the addresses of user space buffers to be well
aligned. The exact requirements vary by OS and file system (typically
sectors and/or memory pages). The address alignment size is set to
4096, which is enough for currently known systems: it matches modern
sectors and common memory page size. There is no standard governing
O_DIRECT's requirements so we might eventually have to reconsider this
with more information from the field or future systems.
Aligning I/O buffers on memory pages is also known to improve regular
buffered I/O performance.
Three classes of I/O buffers for regular data pages are adjusted:
(1) Heap buffers are now allocated with the new palloc_aligned() or
MemoryContextAllocAligned() functions introduced by commit 439f617.
(2) Stack buffers now use a new struct PGIOAlignedBlock to respect
PG_IO_ALIGN_SIZE, if possible with this compiler. (3) The buffer
pool is also aligned in shared memory.
WAL buffers were already aligned on XLOG_BLCKSZ. It's possible for
XLOG_BLCKSZ to be configured smaller than PG_IO_ALIGNED_SIZE and thus
for O_DIRECT WAL writes to fail to be well aligned, but that's a
pre-existing condition and will be addressed by a later commit.
BufFiles are not yet addressed (there's no current plan to use O_DIRECT
for those, but they could potentially get some incidental speedup even
in plain buffered I/O operations through better alignment).
If we can't align stack objects suitably using the compiler extensions
we know about, we disable the use of O_DIRECT by setting PG_O_DIRECT to
0. This avoids the need to consider systems that have O_DIRECT but
can't align stack objects the way we want; such systems could in theory
be supported with more work but we don't currently know of any such
machines, so it's easier to pretend there is no O_DIRECT support
instead. That's an existing and tested class of system.
Add assertions that all buffers passed into smgrread(), smgrwrite() and
smgrextend() are correctly aligned, unless PG_O_DIRECT is 0 (= stack
alignment tricks may be unavailable) or the block size has been set too
small to allow arrays of buffers to be all aligned.
Author: Thomas Munro <thomas.munro@gmail.com>
Author: Andres Freund <andres@anarazel.de>
Reviewed-by: Justin Pryzby <pryzby@telsasoft.com>
Discussion: https://postgr.es/m/CA+hUKGK1X532hYqJ_MzFWt0n1zt8trz980D79WbjwnT-yYLZpg@mail.gmail.com1 parent d73c285 commit faeedbc
File tree
26 files changed
+108
-45
lines changed- contrib
- bloom
- pg_prewarm
- src
- backend
- access
- gist
- hash
- heap
- nbtree
- spgist
- transam
- catalog
- storage
- buffer
- file
- page
- smgr
- utils/sort
- bin
- pg_checksums
- pg_rewind
- pg_upgrade
- common
- include
- storage
- tools/pgindent
26 files changed
+108
-45
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
166 | 166 | | |
167 | 167 | | |
168 | 168 | | |
169 | | - | |
| 169 | + | |
170 | 170 | | |
171 | 171 | | |
172 | 172 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
36 | 36 | | |
37 | 37 | | |
38 | 38 | | |
39 | | - | |
| 39 | + | |
40 | 40 | | |
41 | 41 | | |
42 | 42 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
415 | 415 | | |
416 | 416 | | |
417 | 417 | | |
418 | | - | |
| 418 | + | |
419 | 419 | | |
420 | 420 | | |
421 | 421 | | |
| |||
509 | 509 | | |
510 | 510 | | |
511 | 511 | | |
512 | | - | |
| 512 | + | |
| 513 | + | |
513 | 514 | | |
514 | 515 | | |
515 | 516 | | |
| |||
579 | 580 | | |
580 | 581 | | |
581 | 582 | | |
582 | | - | |
| 583 | + | |
583 | 584 | | |
584 | 585 | | |
585 | 586 | | |
| |||
630 | 631 | | |
631 | 632 | | |
632 | 633 | | |
633 | | - | |
| 634 | + | |
634 | 635 | | |
635 | 636 | | |
636 | 637 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
992 | 992 | | |
993 | 993 | | |
994 | 994 | | |
995 | | - | |
| 995 | + | |
996 | 996 | | |
997 | 997 | | |
998 | 998 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
255 | 255 | | |
256 | 256 | | |
257 | 257 | | |
258 | | - | |
| 258 | + | |
259 | 259 | | |
260 | 260 | | |
261 | 261 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
154 | 154 | | |
155 | 155 | | |
156 | 156 | | |
157 | | - | |
| 157 | + | |
158 | 158 | | |
159 | 159 | | |
160 | 160 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
619 | 619 | | |
620 | 620 | | |
621 | 621 | | |
622 | | - | |
| 622 | + | |
623 | 623 | | |
624 | 624 | | |
625 | 625 | | |
| |||
660 | 660 | | |
661 | 661 | | |
662 | 662 | | |
663 | | - | |
| 663 | + | |
| 664 | + | |
| 665 | + | |
664 | 666 | | |
665 | 667 | | |
666 | 668 | | |
| |||
1170 | 1172 | | |
1171 | 1173 | | |
1172 | 1174 | | |
1173 | | - | |
| 1175 | + | |
1174 | 1176 | | |
1175 | 1177 | | |
1176 | 1178 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
158 | 158 | | |
159 | 159 | | |
160 | 160 | | |
161 | | - | |
| 161 | + | |
162 | 162 | | |
163 | 163 | | |
164 | 164 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
58 | 58 | | |
59 | 59 | | |
60 | 60 | | |
61 | | - | |
| 61 | + | |
| 62 | + | |
| 63 | + | |
| 64 | + | |
62 | 65 | | |
63 | 66 | | |
| 67 | + | |
| 68 | + | |
64 | 69 | | |
65 | 70 | | |
66 | 71 | | |
67 | | - | |
68 | | - | |
69 | 72 | | |
70 | 73 | | |
71 | 74 | | |
| |||
269 | 272 | | |
270 | 273 | | |
271 | 274 | | |
272 | | - | |
| 275 | + | |
| 276 | + | |
| 277 | + | |
273 | 278 | | |
274 | 279 | | |
275 | 280 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
4506 | 4506 | | |
4507 | 4507 | | |
4508 | 4508 | | |
4509 | | - | |
| 4509 | + | |
4510 | 4510 | | |
4511 | 4511 | | |
4512 | 4512 | | |
| |||
0 commit comments