首页 » GoldenGate » Goldengate Integrated Extract集成模式抽取挂起?

Goldengate Integrated Extract集成模式抽取挂起?

最近一套ogg环境反馈做了些磁盘维护后抽取进程abended, 查看ggs日志发现有提示OGG-00662 OCI ERROR ora-03113 ora-03115和OGG-02078 extract encountered a fatal error错误,启动后使用ggsci info ext xx查看RBA无变化,简单记录一下该案例。

什么是Goldengate Integrated Extract

Oracle 从 OGG 11.2.x 引入了 Integrated Extract 或 Integrated Capture。它们将此捕获与 Oracle RDBMS 绑定,因此称为集成捕获。相对应的Classic Capture 与 Oracle 数据库分离,而 Integrated Capture 与 Oracle 数据库紧密耦合。在这里,不是直接从数据库中捕获事务的提取过程,而是 Logmining Server 扮演的角色。LogMining Server将与数据库中的此集成捕获过程一起附加,以挖掘redo log或archivelog并捕获事务。
集成模式的抽取有以下子进程:
READER – 读取日志文件并拆分为区域
PREPARER – 根据提取参数扫描日志文件和预过滤器的区域
BUILDER – 按 SCN 顺序合并准备好的记录
CAPTURE – 格式化逻辑变更记录 (LCR) 并传递给 Oracle GoldenGate Extract

Oracle GoldenGate 支持所有 Oracle 数据和存储类型。但要做到这一点,源数据库应至少为 11.2.0.3,低的版本需要安装一些补丁。集成捕获此过程主要针对流概念。会占用或消耗STREAMS_POOL_SIZE内存.

案例

因为日志中看到了ora-3113 ora-3115怀疑是数据库实例发生了重启,从数据库v$instance.START_TIME确认DB实例确实刚刚重启。

ggsci> info xx , detail

Integrated Extract outbound server first scn: Unavailable.Need DBLOGIN.

Integrated Extract outbound server filtering start scn: Unavailable.Need DBLOGIN.

ggsci> DBLOGIN xxx

ggsci> stop mgr

ggsci> start mgr

ggsci> send extract xx , forcestop

ggsci> start extract xx

ggsci> view report xx

2023-12-19 18:14:07  INFO    OGG-01639  BOUNDED RECOVERY:  ACTIVE: for object pool 1: p93305_extr.
2023-12-19 18:14:07  INFO    OGG-01640  BOUNDED RECOVERY: recovery start XID: 1.45.58.16824898.
2023-12-19 18:14:07  INFO    OGG-01641  BOUNDED RECOVERY: recovery start position: SeqNo: 717484, RBA: 1394662416, SCN: 3254.3031471228 (13978855052412), Timestamp: 2023-12-19 14:54:13.000000.
2023-12-19 18:14:07  INFO    OGG-01642  BOUNDED RECOVERY: recovery end position: SeqNo: 916180, RBA: 405960548, SCN: 3254.3050473505 (13978874054689), Timestamp: 2023-12-19 15:45:22.000000, Thread: 1.

2023-12-19 18:14:07  INFO    OGG-01579  BOUNDED RECOVERY: VALID BCP: CP.EX_OL.000016487.
2023-12-19 18:14:07  INFO    OGG-01629  BOUNDED RECOVERY: PERSISTED OBJECTS RECOVERED: 21.

Database Version:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
PL/SQL Release 11.2.0.4.0 - Production

GGSCI (weejar1 as ggs@weejar1) 143> info ext_anb, detail

EXTRACT    ext_anb     Last Started 2023-12-19 18:07   Status RUNNING
Checkpoint Lag       01:01:49 (updated 00:16:59 ago)
Process ID           21074
Log Read Checkpoint  Oracle Integrated Redo Logs
                     2023-12-19 17:05:40
                     SCN 3254.3079387801 (13978902968985)


GGSCI (weejar1 as ggs@weejar1) 151> info ext_anb, showch

EXTRACT    ext_anb     Last Started 2023-12-19 18:07   Status RUNNING
Checkpoint Lag       01:29:49 (updated 00:00:02 ago)
Process ID           21074
Log Read Checkpoint  Oracle Integrated Redo Logs
                     2023-12-19 17:05:40
                     SCN 3254.3079387801 (13978902968985)


Current Checkpoint Detail:

Read Checkpoint #1

  Oracle Integrated Redo Log

  Startup Checkpoint (starting position in the data source):
    Timestamp: 2020-03-09 16:19:45.000000
    SCN: Not available

  Recovery Checkpoint (position of oldest unprocessed transaction in the data source):
    Timestamp: 2023-04-07 16:45:19.000000
    SCN: 3219.225976818 (13825725702642)

  Current Checkpoint (position of last record read in the data source):
    Timestamp: 2023-12-19 17:05:40.000000
    SCN: 3254.3079387801 (13978902968985)

  BR Previous Recovery Checkpoint:
    Timestamp: 2023-12-13 12:33:41.000000
    SCN: 3253.4196616217 (13975725230105)

  BR Begin Recovery Checkpoint:
    Timestamp: 2023-12-19 14:54:13.000000
    SCN: 3254.3031471228 (13978855052412)

  BR End Recovery Checkpoint:
    Timestamp: 2023-12-19 15:46:04.000000
    SCN: 3254.3050736660 (13978874317844)

Write Checkpoint #1

  GGS Log Trail

  Current Checkpoint (current write position):
    Sequence #: 109303
    RBA: 43508204
    Timestamp: 2023-12-19 17:05:42.152798
    Extract Trail: ./dirdat/yms/ar
    Seqno Length: 9
    Flip Seqno Length: No
    Trail Type: EXTTRAIL

Header:
  Version = 2
  Record Source = A
  Type = 13
  # Input Checkpoints = 1
  # Output Checkpoints = 1

File Information:
  Block Size = 2048
  Max Blocks = 100
  Record Length = 2048
  Current Offset = 0

Configuration:
  Data Source = 3
  Transaction Integrity = 1
  Task Type = 0

Status:
  Start Time = 2023-12-19 18:07:56
  Last Update Time = 2023-12-19 18:35:29
  Stop Status = A
  Last Result = 520



GGSCI (weejar1 as ggs@weejar1) 152> info ext_anb, showch

EXTRACT    ext_anb     Last Started 2023-12-19 18:07   Status RUNNING
Checkpoint Lag       01:31:10 (updated 00:00:01 ago)
Process ID           21074
Log Read Checkpoint  Oracle Integrated Redo Logs
                     2023-12-19 17:05:40
                     SCN 3254.3079387801 (13978902968985)


Current Checkpoint Detail:

Read Checkpoint #1

  Oracle Integrated Redo Log

  Startup Checkpoint (starting position in the data source):
    Timestamp: 2020-03-09 16:19:45.000000
    SCN: Not available

  Recovery Checkpoint (position of oldest unprocessed transaction in the data source):
    Timestamp: 2023-04-07 16:45:19.000000
    SCN: 3219.225976818 (13825725702642)

  Current Checkpoint (position of last record read in the data source):
    Timestamp: 2023-12-19 17:05:40.000000
    SCN: 3254.3079387801 (13978902968985)

  BR Previous Recovery Checkpoint:
    Timestamp: 2023-12-13 12:33:41.000000
    SCN: 3253.4196616217 (13975725230105)

  BR Begin Recovery Checkpoint:
    Timestamp: 2023-12-19 14:54:13.000000
    SCN: 3254.3031471228 (13978855052412)

  BR End Recovery Checkpoint:
    Timestamp: 2023-12-19 15:46:04.000000
    SCN: 3254.3050736660 (13978874317844)

Write Checkpoint #1

  GGS Log Trail

  Current Checkpoint (current write position):
    Sequence #: 109303
    RBA: 43508204
    Timestamp: 2023-12-19 17:05:42.152798
    Extract Trail: ./dirdat/yms/ar
    Seqno Length: 9
    Flip Seqno Length: No
    Trail Type: EXTTRAIL

Header:
  Version = 2
  Record Source = A
  Type = 13
  # Input Checkpoints = 1
  # Output Checkpoints = 1

File Information:
  Block Size = 2048
  Max Blocks = 100
  Record Length = 2048
  Current Offset = 0

Configuration:
  Data Source = 3
  Transaction Integrity = 1
  Task Type = 0

Status:
  Start Time = 2023-12-19 18:07:56
  Last Update Time = 2023-12-19 18:36:50
  Stop Status = A
  Last Result = 520


Bounded Recovery是通用提取检查点工具的一个组件。它保证了在Extract因任何原因(计划内或计划外)停止后的有效恢复,无论Extract停止时有多少打开(未提交)事务,也无论它们有多长时间。当Extract遇到事务的提交记录时,它将整个缓存的事务写入到trail中,并将其从内存中清除。当Extract遇到事务的回滚记录时,它将从内存中丢弃整个事务。在Extract处理提交或回滚之前,事务被认为是打开的,并继续收集其信息。如果Extract在遇到事务的提交或回滚记录之前停止,则在Extract再次启动时必须恢复所有缓存的信息。这适用于Extract停止时打开的所有事务。<OGG(12.3) Extract long time lag after Oracle RAC a instance Crash>之前记录过一个故障 .

Extract执行如下恢复操作:
如果在Extract停止时没有打开的事务,则从当前Extract读取检查点开始恢复。这是一次正常的recovery。
如果存在打开的事务,其日志中的起始点在时间上与Extract停止的时间非常接近,则Extract将通过从最老的打开事务开始重新读取日志来开始恢复。这就要求Extract为那些在Extract停止之前已经写入trail或丢弃的事务执行冗余工作,但是考虑到要处理的数据量相对较少,这些工作是可以接受的成本。这也被认为是正常的recovery。
如果Extract有一个或多个事务符合长时间运行的打开事务的条件,则Extract使用 Bounded Recovery开始其recovery。

在每个Bounded Recovery interval,Extract生成一个Bounded Recovery checkpoint,它将Extract的当前状态和数据持久化到磁盘,包括长时间运行的事务的状态和数据,如果Extract在Bounded Recovery checkpoint后停止,它将从上一个Bounded Recovery interval的位置或最后一个Bounded Recovery checkpoint进行恢复,而不是从最早的打开的长时间运行事务首次出现的日志位置进行处理。 maximum Bounded Recovery time(Extract恢复到停止位置的最大时间)不超过当前Bounded Recovery checkpoint interval的两倍。

简而言之,如果一个长事务一直没有结束,而OGG又是实时解析已commited的事务,不会等未提交的事务,而如是长事务结束前ogg extract要关闭重启,cache里的信息就会被丢弃,下次启动时如果从事务最开始取解析redo或很可能是归档,就导致反复读,因为这些日志之前已提交的事务已经解析并写入trail文件,效率有点低,所以引入了BR, 通过BR interval 参数控制多长时间做一次checkpoint, 而checkpoint完成后,之前开始的长事务在下次重启时,就不再需要从之前的归档开始,甚至不再需要之前的归档,优先读取BR文件。回想第一次接触BR好像是在9年前,BR文件出现问题导致extract挂起。

 

查看Recover的位置

GGSCI (weejar1 as ggs@weejar1) 155> send ex_anb status

Sending STATUS request to EXTRACT ex_anb ...
EXTRACT ex_anb (PID 30373)
Current status: In recovery[1]: Processing data

Current read position:
Redo thread #: 2
Sequence #: 717581
RBA: 186794744
Timestamp: 2023-12-19 16:30:56.000000
SCN: 3254.3066340389 (13978889921573)
Current write position:
Sequence #: 87327
RBA: 197816537
Timestamp: 2023-12-19 17:05:32.876775
Extract Trail: ./dirdat/yms/ol

GGSCI (weejar1 as ggs@weejar1) 156> !
send ex_anb status

Sending STATUS request to EXTRACT ex_anb ...
EXTRACT ex_anb (PID 30373)
Current status: In recovery[1]: Processing data

Current read position:
Redo thread #: 1
Sequence #: 916250
RBA: 102453108
Timestamp: 2023-12-19 16:33:02.000000
SCN: 3254.3066980486 (13978890561670)
Current write position:
Sequence #: 87327
RBA: 197816537
Timestamp: 2023-12-19 17:05:32.876775
Extract Trail: ./dirdat/yms/ol

Note:
可看read Timestamp和SCN在变化,等待恢复完成即可.

 

References
https://docs.oracle.com/en/middleware/goldengate/core/21.3/reference/br.html

Oracle GoldenGate – Integrated Capture

打赏

,

对不起,这篇文章暂时关闭评论。