|
27 | 27 | </para> |
28 | 28 |
|
29 | 29 | <para> |
30 | | - While forcing data periodically to the disk platters might seem like |
| 30 | + While forcing data to the disk platters periodically might seem like |
31 | 31 | a simple operation, it is not. Because disk drives are dramatically |
32 | 32 | slower than main memory and CPUs, several layers of caching exist |
33 | 33 | between the computer's main memory and the disk platters. |
|
48 | 48 | some later time. Such caches can be a reliability hazard because the |
49 | 49 | memory in the disk controller cache is volatile, and will lose its |
50 | 50 | contents in a power failure. Better controller cards have |
51 | | - <firstterm>battery-backed unit</> (<acronym>BBU</>) caches, meaning |
| 51 | + <firstterm>battery-backup units</> (<acronym>BBU</>s), meaning |
52 | 52 | the card has a battery that |
53 | 53 | maintains power to the cache in case of system power loss. After power |
54 | 54 | is restored the data will be written to the disk drives. |
|
57 | 57 | <para> |
58 | 58 | And finally, most disk drives have caches. Some are write-through |
59 | 59 | while some are write-back, and the same concerns about data loss |
60 | | - exist for write-back drive caches as exist for disk controller |
| 60 | + exist for write-back drive caches as for disk controller |
61 | 61 | caches. Consumer-grade IDE and SATA drives are particularly likely |
62 | | - to have write-back caches that will not survive a power failure, |
63 | | - though <acronym>ATAPI-6</> introduced a drive cache flush command |
64 | | - (<command>FLUSH CACHE EXT</>) that some file systems use, e.g. |
65 | | - <acronym>ZFS</>, <acronym>ext4</>. (The SCSI command |
66 | | - <command>SYNCHRONIZE CACHE</> has long been available.) Many |
67 | | - solid-state drives (SSD) also have volatile write-back caches, and |
68 | | - many do not honor cache flush commands by default. |
| 62 | + to have write-back caches that will not survive a power failure. Many |
| 63 | + solid-state drives (SSD) also have volatile write-back caches. |
69 | 64 | </para> |
70 | 65 |
|
71 | 66 | <para> |
|
81 | 76 | a <literal>*</> next to <literal>Write cache</>. <command>hdparm -W</> |
82 | 77 | can be used to turn off write caching. SCSI drives can be queried |
83 | 78 | using <ulink url="http://sg.danny.cz/sg/sdparm.html"><application>sdparm</></ulink>. |
84 | | - for SCSI drives. Use <command>sdparm --get=WCE</command> to check |
| 79 | + Use <command>sdparm --get=WCE</command> to check |
85 | 80 | whether the write cache is enabled and <command>sdparm --clear=WCE</> |
86 | 81 | to disable it. |
87 | 82 | </para> |
|
107 | 102 | <listitem> |
108 | 103 | <para> |
109 | 104 | On <productname>Windows</>, if <varname>wal_sync_method</> is |
110 | | - <literal>open_datasync</> (the default), write caching is disabled |
111 | | - by unchecking <literal>My Computer\Open\{select disk drive}\Properties\Hardware\Properties\Policies\Enable write caching on the disk</>. |
112 | | - Alternatively, set <varname>wal_sync_method</varname> to <literal>fsync</> or <literal>fsync_writethrough</>, which never do write caching. |
| 105 | + <literal>open_datasync</> (the default), write caching can be disabled |
| 106 | + by unchecking <literal>My Computer\Open\<replaceable>disk drive</>\Properties\Hardware\Properties\Policies\Enable write caching on the disk</>. |
| 107 | + Alternatively, set <varname>wal_sync_method</varname> to |
| 108 | + <literal>fsync</> or <literal>fsync_writethrough</>, which prevent |
| 109 | + write caching. |
113 | 110 | </para> |
114 | 111 | </listitem> |
115 | 112 |
|
116 | 113 | <listitem> |
117 | 114 | <para> |
118 | | - On <productname>MacOS X</productname>, write caching can be disabled by |
| 115 | + On <productname>Mac OS X</productname>, write caching can be prevented by |
119 | 116 | setting <varname>wal_sync_method</> to <literal>fsync_writethrough</>. |
120 | 117 | </para> |
121 | 118 | </listitem> |
122 | 119 | </itemizedlist> |
123 | 120 |
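As a rough illustration of what the `wal_sync_method` settings in the list above mean at the system-call level, here is a minimal Python sketch that contrasts an `fsync`-style flush with an `open_datasync`-style synchronous open. It assumes a POSIX-like system. The helper names `write_then_fsync` and `write_with_dsync` are hypothetical, and this is not how PostgreSQL itself is implemented. Whether the data actually reaches the platters still depends on the controller and drive caches discussed in the surrounding text.

```python
import os
import tempfile


def _write_all(fd: int, data: bytes) -> None:
    """Write every byte of data, looping because os.write may be partial."""
    view = memoryview(data)
    while view:
        written = os.write(fd, view)
        view = view[written:]


def write_then_fsync(path: str, data: bytes) -> None:
    """fsync style: write normally, then ask the OS to flush the file."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        _write_all(fd, data)
        os.fsync(fd)  # blocks until the OS reports the data as flushed
    finally:
        os.close(fd)


def write_with_dsync(path: str, data: bytes) -> None:
    """open_datasync style: open with O_DSYNC so each write is synchronous."""
    # Assumption: O_DSYNC is not defined on every platform, so fall back
    # to a plain open followed by an explicit fsync when it is missing.
    dsync = getattr(os, "O_DSYNC", 0)
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC | dsync, 0o600)
    try:
        _write_all(fd, data)
        if not dsync:
            os.fsync(fd)
    finally:
        os.close(fd)


if __name__ == "__main__":
    page = b"x" * 8192  # one 8192-byte page, the size mentioned in the text
    with tempfile.TemporaryDirectory() as tmp:
        write_then_fsync(os.path.join(tmp, "fsync.dat"), page)
        write_with_dsync(os.path.join(tmp, "dsync.dat"), page)
```

Both helpers only return once the operating system reports the write as complete. Neither can tell whether a volatile write-back cache further down the stack has acknowledged the write early.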
|
124 | 121 | <para> |
125 | | - Many file systems that use write barriers (e.g. <acronym>ZFS</>, |
126 | | - <acronym>ext4</>) internally use <command>FLUSH CACHE EXT</> or |
127 | | - <command>SYNCHRONIZE CACHE</> commands to flush data to the platters on |
128 | | - write-back-enabled drives. Unfortunately, such write barrier file |
129 | | - systems behave suboptimally when combined with battery-backed unit |
| 122 | + Recent SATA drives (those following <acronym>ATAPI-6</> or later) |
| 123 | + offer a drive cache flush command (<command>FLUSH CACHE EXT</>), |
| 124 | + while SCSI drives have long supported a similar command |
| 125 | + <command>SYNCHRONIZE CACHE</>. These commands are not directly |
| 126 | + accessible to <productname>PostgreSQL</>, but some file systems |
| 127 | + (e.g., <acronym>ZFS</>, <acronym>ext4</>) can use them to flush |
| 128 | + data to the platters on write-back-enabled drives. Unfortunately, such |
| 129 | + file systems behave suboptimally when combined with battery-backup unit |
130 | 130 | (<acronym>BBU</>) disk controllers. In such setups, the synchronize |
131 | | - command forces all data from the BBU to the disks, eliminating much |
132 | | - of the benefit of the BBU. You can run the utility |
| 131 | + command forces all data from the controller cache to the disks, |
| 132 | + eliminating much of the benefit of the BBU. You can run the utility |
133 | 133 | <filename>src/tools/fsync</> in the PostgreSQL source tree to see |
134 | 134 | if you are affected. If you are affected, the performance benefits |
135 | | - of the BBU cache can be regained by turning off write barriers in |
| 135 | + of the BBU can be regained by turning off write barriers in |
136 | 136 | the file system or reconfiguring the disk controller, if that is |
137 | 137 | an option. If write barriers are turned off, make sure the battery |
138 | | - remains active; a faulty battery can potentially lead to data loss. |
| 138 | + remains functional; a faulty battery can potentially lead to data loss. |
139 | 139 | Hopefully file system and disk controller designers will eventually |
140 | 140 | address this suboptimal behavior. |
141 | 141 | </para> |
|
148 | 148 | ensure data integrity. Avoid disk controllers that have non-battery-backed |
149 | 149 | write caches. At the drive level, disable write-back caching if the |
150 | 150 | drive cannot guarantee the data will be written before shutdown. |
| 151 | + If you use SSDs, be aware that many of these do not honor cache flush |
| 152 | + commands by default. |
151 | 153 | You can test for reliable I/O subsystem behavior using <ulink |
152 | 154 | url="http://brad.livejournal.com/2116715.html"><filename>diskchecker.pl</filename></ulink>. |
153 | 155 | </para> |
|
157 | 159 | operations themselves. Disk platters are divided into sectors, |
158 | 160 | commonly 512 bytes each. Every physical read or write operation |
159 | 161 | processes a whole sector. |
160 | | - When a write request arrives at the drive, it might be for 512 bytes, |
161 | | - 1024 bytes, or 8192 bytes, and the process of writing could fail due |
| 162 | + When a write request arrives at the drive, it might be for some multiple |
| 163 | + of 512 bytes (<productname>PostgreSQL</> typically writes 8192 bytes, or |
| 164 | + 16 sectors, at a time), and the process of writing could fail due |
162 | 165 | to power loss at any time, meaning some of the 512-byte sectors were |
163 | | - written, and others were not. To guard against such failures, |
| 166 | + written while others were not. To guard against such failures, |
164 | 167 | <productname>PostgreSQL</> periodically writes full page images to |
165 | 168 | permanent WAL storage <emphasis>before</> modifying the actual page on |
166 | 169 | disk. By doing this, during crash recovery <productname>PostgreSQL</> can |
167 | | - restore partially-written pages. If you have a battery-backed disk |
| 170 | + restore partially-written pages from WAL. If you have a battery-backed disk |
168 | 171 | controller or file-system software that prevents partial page writes |
169 | | - (e.g., ZFS), you can turn off this page imaging by turning off the |
| 172 | + (e.g., ZFS), you can safely turn off this page imaging by turning off the |
170 | 173 | <xref linkend="guc-full-page-writes"> parameter. |
171 | 174 | </para> |
172 | 175 | </sect1> |
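The partial-page-write scenario described in the last hunk can be sketched with a small simulation. The following Python example is a simplified model, not PostgreSQL's recovery code. It assumes 512-byte sectors and an 8192-byte page (16 sectors), as stated in the text, and the function names `torn_write` and `recover` are hypothetical.

```python
from typing import Optional

SECTOR_SIZE = 512                            # bytes per sector, per the text
PAGE_SIZE = 8192                             # bytes per page, per the text
SECTORS_PER_PAGE = PAGE_SIZE // SECTOR_SIZE  # 16


def torn_write(old_page: bytes, new_page: bytes, sectors_written: int) -> bytes:
    """Return the on-disk page if power fails after sectors_written sectors."""
    if len(old_page) != PAGE_SIZE or len(new_page) != PAGE_SIZE:
        raise ValueError("pages must be exactly PAGE_SIZE bytes")
    if not 0 <= sectors_written <= SECTORS_PER_PAGE:
        raise ValueError("sectors_written out of range")
    cut = sectors_written * SECTOR_SIZE
    # Sectors before the cut hold new data; the rest still hold old data.
    return new_page[:cut] + old_page[cut:]


def recover(on_disk_page: bytes, full_page_image: Optional[bytes]) -> bytes:
    """Simplified recovery: a full page image from WAL replaces the page."""
    # Assumption: real recovery also replays later WAL records; this model
    # only shows why the logged image repairs a torn page.
    if full_page_image is None:
        return on_disk_page
    return full_page_image


if __name__ == "__main__":
    old = b"A" * PAGE_SIZE
    new = b"B" * PAGE_SIZE
    torn = torn_write(old, new, sectors_written=5)
    print(torn not in (old, new))      # the page is a mix of old and new
    print(recover(torn, new) == new)   # the full page image repairs it
```

Without the full page image, the torn page matches neither the old nor the new version. This is the failure that the `full_page_writes` parameter guards against.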
|