|
297 | 297 | transaction processing. Briefly, <acronym>WAL</acronym>'s central |
298 | 298 | concept is that changes to data files (where tables and indexes |
299 | 299 | reside) must be written only after those changes have been logged, |
300 | | - that is, after log records describing the changes have been flushed |
| 300 | + that is, after WAL records describing the changes have been flushed |
301 | 301 | to permanent storage. If we follow this procedure, we do not need |
302 | 302 | to flush data pages to disk on every transaction commit, because we |
303 | 303 | know that in the event of a crash we will be able to recover the |
304 | 304 | database using the log: any changes that have not been applied to |
305 | | - the data pages can be redone from the log records. (This is |
| 305 | + the data pages can be redone from the WAL records. (This is |
306 | 306 | roll-forward recovery, also known as REDO.) |
307 | 307 | </para> |
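 <para>
  One way to observe this ordering from SQL (a minimal sketch, assuming a
  server version recent enough to provide these standard functions) is to
  compare the WAL insert position with the flush position; data pages are
  never written ahead of the flushed WAL:
 </para>
<programlisting>
-- Insert position: where the next WAL record will be placed.
-- Flush position: everything up to here is known to be on permanent storage.
SELECT pg_current_wal_insert_lsn() AS insert_lsn,
       pg_current_wal_flush_lsn()  AS flush_lsn;
</programlisting>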
308 | 308 |
|
|
323 | 323 |
|
324 | 324 | <para> |
325 | 325 | Using <acronym>WAL</acronym> results in a |
326 | | - significantly reduced number of disk writes, because only the log |
| 326 | + significantly reduced number of disk writes, because only the WAL |
327 | 327 | file needs to be flushed to disk to guarantee that a transaction is |
328 | 328 | committed, rather than every data file changed by the transaction. |
329 | | - The log file is written sequentially, |
330 | | - and so the cost of syncing the log is much less than the cost of |
| 329 | + The WAL file is written sequentially, |
| 330 | + and so the cost of syncing the WAL is much less than the cost of |
331 | 331 | flushing the data pages. This is especially true for servers |
332 | 332 | handling many small transactions touching different parts of the data |
333 | 333 | store. Furthermore, when the server is processing many small concurrent |
334 | | - transactions, one <function>fsync</function> of the log file may |
| 334 | + transactions, one <function>fsync</function> of the WAL file may |
335 | 335 | suffice to commit many transactions. |
336 | 336 | </para> |
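 <para>
  A rough way to see this effect (a sketch; the <structname>pg_stat_wal</structname>
  view and these particular columns are assumed to be available on your server
  version) is to compare the number of WAL sync calls with the number of
  committed transactions:
 </para>
<programlisting>
-- One fsync of the WAL can cover several commits, so wal_sync is often
-- much smaller than the total number of transaction commits.
SELECT w.wal_records, w.wal_bytes, w.wal_write, w.wal_sync,
       (SELECT sum(xact_commit) FROM pg_stat_database) AS total_commits
FROM pg_stat_wal AS w;
</programlisting>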
337 | 337 |
|
|
341 | 341 | linkend="continuous-archiving"/>. By archiving the WAL data we can support |
342 | 342 | reverting to any time instant covered by the available WAL data: |
343 | 343 | we simply install a prior physical backup of the database, and |
344 | | - replay the WAL log just as far as the desired time. What's more, |
| 344 | + replay the WAL just as far as the desired time. What's more, |
345 | 345 | the physical backup doesn't have to be an instantaneous snapshot |
346 | 346 | of the database state — if it is made over some period of time, |
347 | | - then replaying the WAL log for that period will fix any internal |
| 347 | + then replaying the WAL for that period will fix any internal |
348 | 348 | inconsistencies. |
349 | 349 | </para> |
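 <para>
  A hedged configuration sketch of the archiving side (the archive directory
  and the recovery timestamp below are placeholders, not values from the
  original text):
 </para>
<programlisting>
-- Archive each completed WAL segment; changing archive_mode requires a restart.
ALTER SYSTEM SET archive_mode = on;
ALTER SYSTEM SET archive_command = 'cp %p /mnt/wal_archive/%f';
-- During a later point-in-time recovery one would set, for example:
--   restore_command      = 'cp /mnt/wal_archive/%f %p'
--   recovery_target_time = '2024-05-01 12:00:00+00'
</programlisting>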
350 | 350 | </sect1> |
|
497 | 497 | that the heap and index data files have been updated with all |
498 | 498 | information written before that checkpoint. At checkpoint time, all |
499 | 499 | dirty data pages are flushed to disk and a special checkpoint record is |
500 | | - written to the log file. (The change records were previously flushed |
| 500 | + written to the WAL file. (The change records were previously flushed |
501 | 501 | to the <acronym>WAL</acronym> files.) |
502 | 502 | In the event of a crash, the crash recovery procedure looks at the latest |
503 | | - checkpoint record to determine the point in the log (known as the redo |
| 503 | + checkpoint record to determine the point in the WAL (known as the redo |
504 | 504 | record) from which it should start the REDO operation. Any changes made to |
505 | 505 | data files before that point are guaranteed to be already on disk. |
506 | | - Hence, after a checkpoint, log segments preceding the one containing |
| 506 | + Hence, after a checkpoint, WAL segments preceding the one containing |
507 | 507 | the redo record are no longer needed and can be recycled or removed. (When |
508 | | - <acronym>WAL</acronym> archiving is being done, the log segments must be |
| 508 | + <acronym>WAL</acronym> archiving is being done, the WAL segments must be |
509 | 509 | archived before being recycled or removed.) |
510 | 510 | </para> |
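 <para>
  For illustration (a sketch; <command>CHECKPOINT</command> requires superuser
  privileges or membership in <literal>pg_checkpoint</literal>), the redo record
  of the latest checkpoint, and the segment containing it, can be inspected from
  SQL:
 </para>
<programlisting>
CHECKPOINT;  -- force an immediate checkpoint
SELECT checkpoint_lsn, redo_lsn, redo_wal_file
FROM pg_control_checkpoint();
-- Segments older than redo_wal_file are candidates for recycling or removal.
</programlisting>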
511 | 511 |
|
|
544 | 544 | another factor to consider. To ensure data page consistency, |
545 | 545 | the first modification of a data page after each checkpoint results in |
546 | 546 | logging the entire page content. In that case, |
547 | | - a smaller checkpoint interval increases the volume of output to the WAL log, |
| 547 | + a smaller checkpoint interval increases the volume of output to the WAL, |
548 | 548 | partially negating the goal of using a smaller interval, |
549 | 549 | and in any case causing more disk I/O. |
550 | 550 | </para> |
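 <para>
  To gauge how much of the WAL volume consists of full-page images versus
  ordinary records (a sketch assuming the <structname>pg_stat_wal</structname>
  view is available), the cumulative counters can be read alongside the current
  checkpoint settings:
 </para>
<programlisting>
SHOW checkpoint_timeout;
SHOW max_wal_size;
-- wal_fpi counts full-page images; wal_records counts WAL records overall.
SELECT wal_fpi, wal_records, pg_size_pretty(wal_bytes) AS wal_volume
FROM pg_stat_wal;
</programlisting>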
|
614 | 614 | <para> |
615 | 615 | The number of WAL segment files in the <filename>pg_wal</filename> directory depends on |
616 | 616 | <varname>min_wal_size</varname>, <varname>max_wal_size</varname> and |
617 | | - the amount of WAL generated in previous checkpoint cycles. When old log |
| 617 | + the amount of WAL generated in previous checkpoint cycles. When old WAL |
618 | 618 | segment files are no longer needed, they are removed or recycled (that is, |
619 | 619 | renamed to become future segments in the numbered sequence). If, due to a |
620 | | - short-term peak of log output rate, <varname>max_wal_size</varname> is |
| 620 | + short-term peak of WAL output rate, <varname>max_wal_size</varname> is |
621 | 621 | exceeded, the unneeded segment files will be removed until the system |
622 | 622 | gets back under this limit. Below that limit, the system recycles enough |
623 | 623 | WAL files to cover the estimated need until the next checkpoint, and |
|
650 | 650 | which are similar to checkpoints in normal operation: the server forces |
651 | 651 | all its state to disk, updates the <filename>pg_control</filename> file to |
652 | 652 | indicate that the already-processed WAL data need not be scanned again, |
653 | | - and then recycles any old log segment files in the <filename>pg_wal</filename> |
| 653 | + and then recycles any old WAL segment files in the <filename>pg_wal</filename> |
654 | 654 | directory. |
655 | 655 | Restartpoints can't be performed more frequently than checkpoints on the |
656 | 656 | primary because restartpoints can only be performed at checkpoint records. |
|
676 | 676 | insertion) at a time when an exclusive lock is held on affected |
677 | 677 | data pages, so the operation needs to be as fast as possible. What |
678 | 678 | is worse, writing <acronym>WAL</acronym> buffers might also force the |
679 | | - creation of a new log segment, which takes even more |
| 679 | + creation of a new WAL segment, which takes even more |
680 | 680 | time. Normally, <acronym>WAL</acronym> buffers should be written |
681 | 681 | and flushed by an <function>XLogFlush</function> request, which is |
682 | 682 | made, for the most part, at transaction commit time to ensure that |
683 | 683 | transaction records are flushed to permanent storage. On systems |
684 | | - with high log output, <function>XLogFlush</function> requests might |
| 684 | + with high WAL output, <function>XLogFlush</function> requests might |
685 | 685 | not occur often enough to prevent <function>XLogInsertRecord</function> |
686 | 686 | from having to do writes. On such systems |
687 | 687 | one should increase the number of <acronym>WAL</acronym> buffers by |
|
724 | 724 | <varname>commit_delay</varname>, so this value is recommended as the |
725 | 725 | starting point to use when optimizing for a particular workload. While |
726 | 726 | tuning <varname>commit_delay</varname> is particularly useful when the |
727 | | - WAL log is stored on high-latency rotating disks, benefits can be |
| 727 | + WAL is stored on high-latency rotating disks, benefits can be |
728 | 728 | significant even on storage media with very fast sync times, such as |
729 | 729 | solid-state drives or RAID arrays with a battery-backed write cache; |
730 | 730 | but this should definitely be tested against a representative workload. |
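 <para>
  A hedged tuning sketch (the values below are purely illustrative starting
  points, not recommendations from the original text):
 </para>
<programlisting>
-- commit_delay is in microseconds; commit_siblings is the minimum number of
-- concurrently open transactions required before the delay is applied.
ALTER SYSTEM SET commit_delay = 1000;
ALTER SYSTEM SET commit_siblings = 5;
SELECT pg_reload_conf();
</programlisting>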
|
828 | 828 | <para> |
829 | 829 | <acronym>WAL</acronym> is automatically enabled; no action is |
830 | 830 | required from the administrator except ensuring that the |
831 | | - disk-space requirements for the <acronym>WAL</acronym> logs are met, |
| 831 | + disk-space requirements for the <acronym>WAL</acronym> files are met, |
832 | 832 | and that any necessary tuning is done (see <xref |
833 | 833 | linkend="wal-configuration"/>). |
834 | 834 | </para> |
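 <para>
  One way to keep an eye on the disk-space requirement (a sketch;
  <function>pg_ls_waldir()</function> is restricted by default to superusers and
  members of <literal>pg_monitor</literal>):
 </para>
<programlisting>
-- Number of WAL segment files and their total size in pg_wal.
SELECT count(*) AS segments,
       pg_size_pretty(sum(size)) AS total_size
FROM pg_ls_waldir();
</programlisting>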
835 | 835 |
|
836 | 836 | <para> |
837 | 837 | <acronym>WAL</acronym> records are appended to the <acronym>WAL</acronym> |
838 | | - logs as each new record is written. The insert position is described by |
| 838 | + files as each new record is written. The insert position is described by |
839 | 839 | a Log Sequence Number (<acronym>LSN</acronym>) that is a byte offset into |
840 | | - the logs, increasing monotonically with each new record. |
| 840 | + the WAL, increasing monotonically with each new record. |
841 | 841 | <acronym>LSN</acronym> values are returned as the datatype |
842 | 842 | <link linkend="datatype-pg-lsn"><type>pg_lsn</type></link>. Values can be |
843 | 843 | compared to calculate the volume of <acronym>WAL</acronym> data that |
|
846 | 846 | </para> |
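 <para>
  For example (a sketch; the two literal LSNs below are placeholders, as if
  captured before and after some workload):
 </para>
<programlisting>
-- Current WAL write position.
SELECT pg_current_wal_lsn();
-- Volume of WAL generated between two LSNs.
SELECT pg_size_pretty(pg_wal_lsn_diff('16/B374D848', '16/B3617728'));
</programlisting>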
847 | 847 |
|
848 | 848 | <para> |
849 | | - <acronym>WAL</acronym> logs are stored in the directory |
| 849 | + <acronym>WAL</acronym> files are stored in the directory |
850 | 850 | <filename>pg_wal</filename> under the data directory, as a set of |
851 | 851 | segment files, normally each 16 MB in size (but the size can be changed |
852 | 852 | by altering the <option>--wal-segsize</option> <application>initdb</application> option). Each segment is |
853 | 853 | divided into pages, normally 8 kB each (this size can be changed via the |
854 | | - <option>--with-wal-blocksize</option> configure option). The log record headers |
| 854 | + <option>--with-wal-blocksize</option> configure option). The WAL record headers |
855 | 855 | are described in <filename>access/xlogrecord.h</filename>; the record |
856 | 856 | content is dependent on the type of event that is being logged. Segment |
857 | 857 | files are given ever-increasing numbers as names, starting at |
|
861 | 861 | </para> |
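 <para>
  The segment size in use and the name of the segment holding the current write
  position can be checked from SQL (a sketch using standard functions):
 </para>
<programlisting>
SHOW wal_segment_size;
-- WAL segment file that contains the current write position.
SELECT pg_walfile_name(pg_current_wal_lsn());
</programlisting>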
862 | 862 |
|
863 | 863 | <para> |
864 | | - It is advantageous if the log is located on a different disk from the |
| 864 | + It is advantageous if the WAL is located on a different disk from the |
865 | 865 | main database files. This can be achieved by moving the |
866 | 866 | <filename>pg_wal</filename> directory to another location (while the server |
867 | 867 | is shut down, of course) and creating a symbolic link from the |
|
877 | 877 | on the disk. A power failure in such a situation might lead to |
878 | 878 | irrecoverable data corruption. Administrators should try to ensure |
879 | 879 | that disks holding <productname>PostgreSQL</productname>'s |
880 | | - <acronym>WAL</acronym> log files do not make such false reports. |
| 880 | + <acronym>WAL</acronym> files do not make such false reports. |
881 | 881 | (See <xref linkend="wal-reliability"/>.) |
882 | 882 | </para> |
883 | 883 |
|
884 | 884 | <para> |
885 | | - After a checkpoint has been made and the log flushed, the |
| 885 | + After a checkpoint has been made and the WAL flushed, the |
886 | 886 | checkpoint's position is saved in the file |
887 | 887 | <filename>pg_control</filename>. Therefore, at the start of recovery, |
888 | 888 | the server first reads <filename>pg_control</filename> and |
889 | 889 | then the checkpoint record; then it performs the REDO operation by |
890 | | - scanning forward from the log location indicated in the checkpoint |
| 890 | + scanning forward from the WAL location indicated in the checkpoint |
891 | 891 | record. Because the entire content of data pages is saved in the |
892 | | - log on the first page modification after a checkpoint (assuming |
| 892 | + WAL on the first page modification after a checkpoint (assuming |
893 | 893 | <xref linkend="guc-full-page-writes"/> is not disabled), all pages |
894 | 894 | changed since the checkpoint will be restored to a consistent |
895 | 895 | state. |
896 | 896 | </para> |
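 <para>
  The point from which REDO would start after a crash, and whether full-page
  writes are in effect, can be read back for illustration (a sketch):
 </para>
<programlisting>
SHOW full_page_writes;
-- redo_lsn is where crash recovery would begin scanning the WAL.
SELECT redo_lsn, checkpoint_lsn FROM pg_control_checkpoint();
</programlisting>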
897 | 897 |
|
898 | 898 | <para> |
899 | 899 | To deal with the case where <filename>pg_control</filename> is |
900 | | - corrupt, we should support the possibility of scanning existing log |
| 900 | + corrupt, we should support the possibility of scanning existing WAL |
901 | 901 | segments in reverse order — newest to oldest — in order to find the |
902 | 902 | latest checkpoint. This has not been implemented yet. |
903 | 903 | <filename>pg_control</filename> is small enough (less than one disk page) |
|