首页 » PostgreSQL/GaussDB » 如何在openGauss/PostgreSQL手动清理XLOG/WAL 文件?

如何在openGauss/PostgreSQL手动清理XLOG/WAL 文件?

openGauss/PostgreSQL中的预写式日志WAL(Write Ahead Log),又名Xlog或redo log,相当于oracle的online redo log, 不同的是oracle online redo log是提前创建几组滚动使用,但在opengauss中只需要本配置参数控制WAL日志的周期,数据库会一直的创建并自动清理,但存在一些情况WAL日志未清理导致目录空间耗尽,或目录空间紧张时手动删除wal日志时,比如如何确认在非归档模式下哪些WAL日志文件可以安全删除?

数据库参数
先看几个数据库实例控制的与wal日志相关的, 这里是这的测试环境openGauss 5.0,默认值。

openGauss=# \! sh show wal
 enable_mix_replication                 | off                | All the replication log sent by the wal streaming.
 enable_wal_shipping_compression        | off                | enable compress xlog during xlog shipping.
 hadr_max_size_for_xlog_receiver        | 256GB              | This param set by user is used for xlog to stop receiving when the gap is larger than this param between replay xlog location and walreceiver r
 max_wal_senders                        | 4                  | Sets the maximum number of simultaneously running WAL sender processes.
 wal_block_size                         | 8192               | Shows the block size in the write ahead log.
 wal_buffers                            | 16MB               | Sets the number of disk-page buffers in shared memory for WAL.
 wal_file_init_num                      | 10                 | Sets the number of xlog segment files that WAL writer auxiliary thread creates at one time.
 wal_flush_delay                        | 1                  | set delay time when iterator table entry.
 wal_flush_timeout                      | 2                  | set timeout when iterator table entry.
 wal_keep_segments                      | 16                 | Sets the number of WAL files held for standby servers.
 wal_level                              | hot_standby        | Sets the level of information written to the WAL.
 wal_log_hints                          | on                 | Writes full pages to WAL when first modified after a checkpoint, even for a non-critical modifications.
 wal_receiver_buffer_size               | 64MB               | Sets the buffer size to receive data from master.
 wal_receiver_connect_retries           | 1                  | Sets the maximum retries to connect master.
 wal_receiver_connect_timeout           | 2s                 | Sets the maximum wait time to connect master.
 wal_receiver_status_interval           | 5s                 | Sets the maximum interval between WAL receiver status reports to the primary.
 wal_receiver_timeout                   | 6s                 | Sets the maximum wait time to receive data from master.
 wal_segment_size                       | 16MB               | Shows the number of pages per write ahead log segment.
 wal_sender_timeout                     | 6s                 | Sets the maximum time to wait for WAL replication.
 wal_sync_method                        | fdatasync          | Selects the method used for forcing WAL updates to disk.
 wal_writer_delay                       | 200ms              | WAL writer sleep time between WAL flushes.
 walsender_max_send_size                | 8MB                | Size of walsender max send size.
 walwriter_cpu_bind                     | -1                 | Sets the binding CPU number for the WAL writer thread.
 walwriter_sleep_threshold              | 500                | Number of idle xlog flushes before xlog flusher goes to sleep.

openGauss=# \! sh show xlog
 advance_xlog_file_num                  | 0                  | Sets the number of xlog files to be initialized in advance.
 archive_timeout                        | 0                  | Forces a switch to the next xlog file if a new file has not been started within N seconds.
 enable_wal_shipping_compression        | off                | enable compress xlog during xlog shipping.
 enable_xlog_prune                      | on                 | Enable xlog prune when not all standys connected and xlog size is largger than max_xlog_size
 hadr_max_size_for_xlog_receiver        | 256GB              | This param set by user is used for xlog to stop receiving when the gap is larger than this param between replay xlog location and walreceiver r
 max_size_for_xlog_prune                | 2147483647kB       | This param set by user is used for xlog to be recycled when not all are connected and the param enable_xlog_prune is on.
 recovery_parse_workers                 | 1                  | The number of recovery threads to do xlog parse.
 recovery_redo_workers                  | 1                  | The number belonging to one parse worker to do xlog redo.
 wal_file_init_num                      | 10                 | Sets the number of xlog segment files that WAL writer auxiliary thread creates at one time.
 walwriter_sleep_threshold              | 500                | Number of idle xlog flushes before xlog flusher goes to sleep.
 xlog_file_path                         |                    | use only one xlog file, the path of it
 xlog_file_size                         | 549755813888       | share storage xlog file size
 xlog_lock_file_path                    |                    | used to control write to xlog_file_path
 xloginsert_locks                       | 8                  | Sets the number of locks used for concurrent xlog insertions.

openGauss=# show archive_mode;
 archive_mode
--------------
 off
(1 row)

自动清理WAL
WAL日志并不是一直存在,会自动的清理。wal_keep_segments参数控制wal日志保留的个数,默认保留最近16个。“pg_xlog”目录下保留事务日志文件的最小数目。
另一个参数max_size_for_xlog_prune参数,在enable_xlog_prune打开时生效,如果有备机断连且xlog日志大小大于此阈值,则回收日志。自动清理时会在pg_log中记录:

[BACKEND] LOG:  attempting to remove WAL segments older than log file 000000010000000000000008

[BACKEND] LOG:  attempting to remove WAL segments older than log file 000000010000000000000009

我们看一下是哪16个

[og@oel7db1 pg_xlog]$ ls -lrt|cat -n
     1  total 425988
     2  drwx------ 2 og og     4096 May  8 11:36 archive_status
     3  -rw------- 1 og og 16777216 May 10 17:17 00000001000000000000001C
     4  -rw------- 1 og og 16777216 May 27 15:58 00000001000000000000001D
     5  -rw------- 1 og og 16777216 May 27 15:59 00000001000000000000001E
     6  -rw------- 1 og og 16777216 May 27 16:00 00000001000000000000001F
     7  -rw------- 1 og og 16777216 May 27 16:00 000000010000000000000020
     8  -rw------- 1 og og 16777216 May 27 16:00 000000010000000000000021
     9  -rw------- 1 og og 16777216 May 27 16:00 000000010000000000000022
    10  -rw------- 1 og og 16777216 May 27 16:00 000000010000000000000023
    11  -rw------- 1 og og 16777216 May 27 16:00 00000001000000000000000A
    12  -rw------- 1 og og 16777216 May 27 16:00 00000001000000000000000B
    13  -rw------- 1 og og 16777216 May 27 16:00 00000001000000000000000C
    14  -rw------- 1 og og 16777216 May 27 16:00 00000001000000000000000D
    15  -rw------- 1 og og 16777216 May 27 16:00 00000001000000000000000E
    16  -rw------- 1 og og 16777216 May 27 16:00 00000001000000000000000F
    17  -rw------- 1 og og 16777216 May 27 16:00 000000010000000000000010
    18  -rw------- 1 og og 16777216 May 27 16:47 000000010000000000000011
    19  -rw------- 1 og og 16777216 May 27 16:47 000000010000000000000012
    20  -rw------- 1 og og 16777216 May 27 16:47 000000010000000000000013
    21  -rw------- 1 og og 16777216 May 27 16:47 000000010000000000000014
    22  -rw------- 1 og og 16777216 May 27 16:47 000000010000000000000015
    23  -rw------- 1 og og 16777216 May 27 16:47 000000010000000000000016
    24  -rw------- 1 og og 16777216 May 27 16:47 000000010000000000000017
    25  -rw------- 1 og og 16777216 May 27 16:47 000000010000000000000018
    26  -rw------- 1 og og 16777216 May 27 16:52 000000010000000000000019
    27  -rw------- 1 og og 16777216 May 27 16:52 00000001000000000000001A
    28  -rw------- 1 og og 16777216 May 27 16:53 00000001000000000000001B

note:
减去当前和之后的。

手动日志切换

相当于Oracle的alter system switch logfile;

penGauss=# select pg_switch_xlog();
 pg_switch_xlog
----------------
 0/F000168
(1 row)

openGauss=# select pg_switch_xlog();
 pg_switch_xlog
----------------
 0/10000168
(1 row)

查看当前的WAL日志

# DB内部
openGauss=# select * from pg_ls_waldir() order by modification asc;
           name           |   size   |      modification
--------------------------+----------+------------------------
 000000010000000000000001 | 16777216 | 2023-05-08 11:37:00+08
 000000010000000000000002 | 16777216 | 2023-05-10 17:17:55+08
 000000010000000000000003 | 16777216 | 2023-05-27 15:58:58+08
 000000010000000000000004 | 16777216 | 2023-05-27 15:59:02+08
 000000010000000000000005 | 16777216 | 2023-05-27 16:00:24+08
 000000010000000000000006 | 16777216 | 2023-05-27 16:00:27+08
 000000010000000000000007 | 16777216 | 2023-05-27 16:00:29+08
 000000010000000000000008 | 16777216 | 2023-05-27 16:00:31+08
 000000010000000000000009 | 16777216 | 2023-05-27 16:00:33+08
 00000001000000000000000A | 16777216 | 2023-05-27 16:00:34+08
 00000001000000000000000B | 16777216 | 2023-05-27 16:00:36+08
 00000001000000000000000C | 16777216 | 2023-05-27 16:00:38+08
 00000001000000000000000D | 16777216 | 2023-05-27 16:00:39+08
 00000001000000000000000E | 16777216 | 2023-05-27 16:00:41+08
 00000001000000000000000F | 16777216 | 2023-05-27 16:00:43+08
 000000010000000000000012 | 16777216 | 2023-05-27 16:00:44+08
 000000010000000000000013 | 16777216 | 2023-05-27 16:00:44+08
 000000010000000000000014 | 16777216 | 2023-05-27 16:00:45+08
 000000010000000000000016 | 16777216 | 2023-05-27 16:00:45+08
 000000010000000000000010 | 16777216 | 2023-05-27 16:00:45+08
 000000010000000000000015 | 16777216 | 2023-05-27 16:00:45+08
 000000010000000000000017 | 16777216 | 2023-05-27 16:00:46+08
 000000010000000000000019 | 16777216 | 2023-05-27 16:00:46+08
 000000010000000000000018 | 16777216 | 2023-05-27 16:00:46+08
 00000001000000000000001A | 16777216 | 2023-05-27 16:00:47+08
 000000010000000000000011 | 16777216 | 2023-05-27 16:03:45+08
(26 rows)


# 操作系统
[og@oel7db1 pg_xlog]$ ls -lrt|cat -n
     1  total 425988
     2  drwx------ 2 og og     4096 May  8 11:36 archive_status
     3  -rw------- 1 og og 16777216 May  8 11:37 000000010000000000000001
     4  -rw------- 1 og og 16777216 May 10 17:17 000000010000000000000002
     5  -rw------- 1 og og 16777216 May 27 15:58 000000010000000000000003
     6  -rw------- 1 og og 16777216 May 27 15:59 000000010000000000000004
     7  -rw------- 1 og og 16777216 May 27 16:00 000000010000000000000005
     8  -rw------- 1 og og 16777216 May 27 16:00 000000010000000000000006
     9  -rw------- 1 og og 16777216 May 27 16:00 000000010000000000000007
    10  -rw------- 1 og og 16777216 May 27 16:00 000000010000000000000008
    11  -rw------- 1 og og 16777216 May 27 16:00 000000010000000000000009
    12  -rw------- 1 og og 16777216 May 27 16:00 00000001000000000000000A
    13  -rw------- 1 og og 16777216 May 27 16:00 00000001000000000000000B
    14  -rw------- 1 og og 16777216 May 27 16:00 00000001000000000000000C
    15  -rw------- 1 og og 16777216 May 27 16:00 00000001000000000000000D
    16  -rw------- 1 og og 16777216 May 27 16:00 00000001000000000000000E
    17  -rw------- 1 og og 16777216 May 27 16:00 00000001000000000000000F
    18  -rw------- 1 og og 16777216 May 27 16:00 000000010000000000000012
    19  -rw------- 1 og og 16777216 May 27 16:00 000000010000000000000013
    20  -rw------- 1 og og 16777216 May 27 16:00 000000010000000000000014
    21  -rw------- 1 og og 16777216 May 27 16:00 000000010000000000000015
    22  -rw------- 1 og og 16777216 May 27 16:00 000000010000000000000010
    23  -rw------- 1 og og 16777216 May 27 16:00 000000010000000000000016
    24  -rw------- 1 og og 16777216 May 27 16:00 000000010000000000000017
    25  -rw------- 1 og og 16777216 May 27 16:00 000000010000000000000018
    26  -rw------- 1 og og 16777216 May 27 16:00 000000010000000000000019
    27  -rw------- 1 og og 16777216 May 27 16:00 00000001000000000000001A
    28  -rw------- 1 og og 16777216 May 27 16:06 000000010000000000000011


日志文件命名格式

000000010000000000000011
WAL日志三个组成部分:
第一部分,叫做时间线,是从1开始递增地数字(类似ORACLE 的SCN的wrap), 当低位满时,前一段加1.
第二部分,叫做LogId,是LSN的高32位(从0开始)
第三部分,叫做LogSeg,是LSN的低32位除以WAL文件的大小,WAL文件的大小默认是16M


WAL日志手动清理

[og@oel7db1 ~]$ ps -ef|grep gauss
og        1989     1  1 11:48 pts/0    00:04:13 gaussdb -D /opensource/opengauss/5.0/db --single_node
og       26086 25871  0 15:59 pts/2    00:00:00 grep --color=auto gauss


[og@oel7db1 pg_xlog]$ pg_controldata  /opensource/opengauss/5.0/db/
pg_control version number:            923
Catalog version number:               201611171
Database system identifier:           576503776236165954
Database cluster state:               in production
pg_control last modified:             Sat 27 May 2023 04:01:45 PM CST
Latest checkpoint location:           0/110001C8
Prior checkpoint location:            0/110000A8
Latest checkpoint's REDO location:    0/11000148
Latest checkpoint's TimeLineID:       1
Latest checkpoint's full_page_writes: off
Latest checkpoint's NextXID:          16617
Latest checkpoint's NextOID:          16431
Latest checkpoint's NextMultiXactId:  2
Latest checkpoint's NextMultiOffset:  0
Latest checkpoint's oldestXID:        12664
Latest checkpoint's oldestXID's DB:   15645
Latest checkpoint's oldestActiveXID:  16617
Latest checkpoint's remove lsn:       0/1
Time of latest checkpoint:            Sat 27 May 2023 04:01:45 PM CST
Minimum recovery ending location:     0/0
Backup start location:                0/0
Backup end location:                  0/0
End-of-backup record required:        no
Current wal_level setting:            hot_standby
Current max_connections setting:      250
Current max_prepared_xacts setting:   200
Current max_locks_per_xact setting:   256
Maximum data alignment:               8
Database block size:                  8192
Blocks per segment of large relation: 131072
WAL block size:                       8192
Bytes per WAL segment:                16777216
Maximum length of identifiers:        64
Maximum columns in an index:          32
Maximum size of a TOAST chunk:        1996
Date/time type storage:               64-bit integers
Float4 argument passing:              by value
Float8 argument passing:              by value
Database system TimeLine:             8

openGauss=# select pg_xlogfile_name('0/110001C8');
     pg_xlogfile_name
--------------------------
 000000010000000000000011
(1 row)

Note:
说明000000010000000000000011(#11)之前的wal日志都可以清理。

可以手动rm 也可以使用pg_archivecleanup,如果当前opengauss中没有该工具,可以从postgresql中复制,目前在openGauss的商业发行版Mogdb中有该工具的说明。 pg_archivecleanup 是一个用于清理旧的归档日志的工具。

pg_archivecleanup [OPTION]... ARCHIVELOCATION OLDESTKEPTWALFILE

e.g.
pg_archivecleanup -d /opensource/opengauss/5.0/db/  000000010000000000000011

主库查看Xlog位置
主库也可以使用pg_current_xlog_location函数查看当前的xlog位置,使用pg_xlogfile_name转换日志文件名。如

openGauss=# select pg_current_xlog_location(),pg_xlogfile_name(pg_current_xlog_location()),txid_current(),now();
 
pg_current_xlog_location |     pg_xlogfile_name     | txid_current |              now
--------------------------+--------------------------+--------------+-------------------------------
 0/200001E8               | 000000010000000000000020 |        21140 | 2022-10-03 14:09:52.749551+08
(1 row)

备库查看 xlog位置

主从同步的模式下,在standby端可以使用 pg_last_xlog_receive_location() 、pg_last_xlog_replay_location()、pg_last_xact_replay_timestamp()分别查看xlog接受到的位置,应用到的位置,应用到的时间。如果想利用move走xlog控制应用到的位置,可以使用 pg_xlog_replay_pause() pg_xlog_replay_resume()对应用XLOG的进度暂停与恢复应用,如果下一下XLOG不存在,日志中会提示无法open 某xlog,而停止。 注意:如果因缺少xlog standby出现实例crash, 在启动时一定注意增加standby参数,因为opengauss的垃圾设计,是依赖进程启动模式判断database role, 直接启动standby 实例(gs_ctl start 不带-M standby)相当于激活db, 也就是oracle的failover standby, 没有办法再回退到standby模式。

打赏

, , ,

对不起,这篇文章暂时关闭评论。