首页 » GoldenGate » OGG 12.2 extract not moving when redo log files are placed on RAW device AIX (解决办法)

OGG 12.2 extract not moving when redo log files are placed on RAW device AIX (解决办法)

近期在配置OGG环境中遇到了个问题,相同ogg和oracle rdbms版本,HPUX和Linux 平台运行正常, 只是AIX 平台的ogg extract进程没有进度, 启动时initial就要好久,同时也未报错, redo文件使用的是RAW device(裸设备), OGG release 12.2 ,   Oracle RDBMS release 11.2.0.3.  在OGG 12以前我们都知道存在一个参数RawDeviceOffset,但是12版本中该参数不存在了,当卷组使用了 -TO 选项时需要配置。

RAWDEVICEOFFSET <n>
Directs Extract where to start reading on a raw device. For most operating systems, the offset is 0, but on the AIX platform it is 4096 bytes. RAWDEVICEOFFSET 0 must be used if the customer’s device is configured with the -TO option. The -TO option directs Oracle not to skip the first 4096 bytes of the logical volume, but start writing at 0. (When Oracle skips the first 4096, it can cause problems with fractured blocks.) RAWDEVICEOFFSET 0 directs Extract to start reading where Oracle starts.

但是到了OGG 12版本时发现该参数已不存在, 开始我们认为OGG新版本难道更智能要以自动判断? 结果查了好久,我们认为错了。下面记录一下问题现象。

先简单的配置extract 环境:

SQL> create table dbmt.t_anbob
  2  (id int,name varchar2(200));

Table created.


[anbob1:oracle:/zwb_arch1/ogg>ggsci

Oracle GoldenGate Command Interpreter for Oracle
Version 12.2.0.1.160419 23261684
AIX 6, ppc, 64bit (optimized), Oracle 11g on Jul 10 2016 22:21:03
Operating system character set identified as ISO-8859-1.
Copyright (C) 1995, 2016, Oracle and/or its affiliates. All rights reserved.

GGSCI (anbob1) 1> info mgr
Manager is running (IP port anbob1.7809, Process ID 13566206).

GGSCI (anbob1 as ggadmin@acctb1) 2> add trandata dbmt.t_anbob
2016-08-26 14:27:39  WARNING OGG-06439  No unique key is defined for table T_ANBOB. All viable columns will be used to represent the key, but may not guarantee uniqueness. KEYCOLS may be used to define the key.
Logging of supplemental redo data enabled for table DBMT.T_ANBOB.
TRANDATA for scheduling columns has been added on table 'DBMT.T_ANBOB'.
TRANDATA for instantiation CSN has been added on table 'DBMT.T_ANBOB'.

GGSCI (anbob1 as ggadmin@acctb1) 3> info trandata dbmt.t_anbob
Logging of supplemental redo log data is enabled for table DBMT.T_ANBOB.
Columns supplementally logged for table DBMT.T_ANBOB: ID, NAME.
Prepared CSN for table DBMT.T_ANBOB: 14943044249987
encrypt password ggadmin_dba_123 encryptkey default

GGSCI (anbob1 as ggadmin@acctb1) 4> edit params e_anbob1
extract e_anbob1
------------------------------------------
-- local extract for anbob schema change
-- date:2016-8-24
-- zhangweizhao
-----------------------------------------
userid oggmgr, password "AACAAAAAAAAAAAPAUGFISAVCSBSAVFTHTCAAKINDXBQDIGQF", encryptkey default

discardFile dirrpt/e_anbob1.dsc, append
discardRollover at 02:00 on sunday
exttrail dirdat/l1
table DBMT.T_ANBOB;

GGSCI (dev-db) 6>add extract e_anbob1, tranlog, THREADS 2, BEGIN now
EXTRACT added.

GGSCI (dev-db) 7> add exttrail dirdat/l1, extract e_anbob1, megabytes 100
EXTTRAIL added.

启动EXTRACT进程,检查状态

GGSCI (anbob1) 5> start ext e_anbob1
Sending START request to MANAGER ...
EXTRACT E_ANBOB1 starting


GGSCI (anbob1) 16> info e_anbob1

EXTRACT    E_ANBOB1  Last Started 2016-08-26 15:19   Status RUNNING
Checkpoint Lag       00:10:28 (updated 00:00:01 ago)
Process ID           7143634
Log Read Checkpoint  Oracle Redo Logs
                     2016-08-26 15:10:20  Thread 1, Seqno 119096, RBA 319654740
                     SCN 3479.853036048 (14943044258832)
Log Read Checkpoint  Oracle Redo Logs
                     2016-08-26 15:10:21  Thread 2, Seqno 137587, RBA 144129552
                     SCN 3479.853036084 (14943044258868)


GGSCI (anbob1) 17> !
info e_anbob1

EXTRACT    E_ANBOB1  Last Started 2016-08-26 15:19   Status RUNNING
Checkpoint Lag       00:10:38 (updated 00:00:04 ago)
Process ID           7143634
Log Read Checkpoint  Oracle Redo Logs
                     2016-08-26 15:10:20  Thread 1, Seqno 119096, RBA 319654740
                     SCN 3479.853036048 (14943044258832)
Log Read Checkpoint  Oracle Redo Logs
                     2016-08-26 15:10:21  Thread 2, Seqno 137587, RBA 144129552
                     SCN 3479.853036084 (14943044258868)

INSERT 10条记录观察

SQL> insert into dbmt.t_anbob select rownum,'anbob' from dual connect by rownum<=10; 10 rows created. SQL> commit;
Commit complete.

GGSCI (anbob1) 34> STATS EXTRACT e_anbob1, TOTAL, HOURLY,
Sending STATS request to EXTRACT E_ANBOB1 ...
Start of Statistics at 2016-08-26 15:28:40.
Output to dirdat/l1:

Extracting from GGADMIN.GG_HEARTBEAT_SEED to GGADMIN.GG_HEARTBEAT_SEED:

*** Total statistics since 2016-08-26 15:19:53 ***
        Total inserts                                      0.00
        Total updates                                      1.00
        Total deletes                                      0.00
        Total discards                                     0.00
        Total operations                                   1.00

*** Hourly statistics since 2016-08-26 15:19:53 ***
        Total inserts                                      0.00
        Total updates                                      1.00
        Total deletes                                      0.00
        Total discards                                     0.00
        Total operations                                   1.00

End of Statistics.

GGSCI (anbob1) 36> info ext e_anbob1, detail

EXTRACT    E_ANBOB1  Last Started 2016-08-26 15:19   Status RUNNING
Checkpoint Lag       00:18:43 (updated 00:00:05 ago)
Process ID           7143634
Log Read Checkpoint  Oracle Redo Logs
                     2016-08-26 15:10:20  Thread 1, Seqno 119096, RBA 319654740
                     SCN 3479.853036048 (14943044258832)
Log Read Checkpoint  Oracle Redo Logs
                     2016-08-26 15:10:21  Thread 2, Seqno 137587, RBA 144129552
                     SCN 3479.853036084 (14943044258868)

Note:
注意我之前insert 的记录并未抽到,并且RBA没有变化, 查看了进程并未报错, 下面我们用truss 跟踪extract进程.

[anbob1:oracle:/zwb_arch1/ogg>ps -ef|grep anbob
  oracle  7143634  5374100   0 15:19:45      -  0:00 /zwb_arch1/ogg/extract PARAMFILE /zwb_arch1/ogg/dirprm/e_anbob1.prm REPORTFILE /zwb_arch1/ogg/dirrpt/E_ANBOB1.rpt PROCESSID E_ANBOB1 USESUBDIRS
  oracle 13172948 14352390   0 15:30:36  pts/2  0:00 grep anbob
  
  
anbob1:/> truss -d -p 7143634
  
7.9706:        access("gglog-E_ANBOB1.xml", 04) Err#2  ENOENT
7.9708:        access("/zwb_arch1/ogg/gglog-E_ANBOB1.xml", 04) Err#2  ENOENT
7.9710:        access("/zwb_arch1/ogg/gglog-E_ANBOB1.xml", 04) Err#2  ENOENT
7.9711:        access("gglog-extract.xml", 04)  Err#2  ENOENT
7.9713:        access("/zwb_arch1/ogg/gglog-extract.xml", 04) Err#2  ENOENT
7.9715:        access("/zwb_arch1/ogg/gglog-extract.xml", 04) Err#2  ENOENT
7.9717:        access("gglog.xml", 04)          Err#2  ENOENT
7.9718:        access("/zwb_arch1/ogg/gglog.xml", 04) Err#2  ENOENT
7.9720:        access("/zwb_arch1/ogg/gglog.xml", 04) Err#2  ENOENT
7.8907:        thread_tsleep(0, 0x09001000A02DC490, 0x0000000000000000, 0x0000000122801C90) = 0
7.9308:        thread_tsleep(0, 0x09001000A02DC400, 0x0000000000000000, 0x00000001106B9C90) = 0
7.9076:        thread_tsleep(0, 0x09001000A02DC4A0, 0x0000000000000000, 0x0000000122C54C90) = 0
8.0008:        thread_tsleep(0, 0x09001000A02DC400, 0x0000000000000000, 0x00000001106B9C90) = 0
7.9955:        thread_tsleep(0, 0x09001000A02DC490, 0x0000000000000000, 0x0000000122801C90) = 0
7.1172:        _nsleep(0x0000000121FCBD80, 0x0000000121FCBE50) = 0
8.0081:        thread_tsleep(0, 0x09001000A02DC4A0, 0x0000000000000000, 0x0000000122C54C90) = 0
8.0708:        thread_tsleep(0, 0x09001000A02DC400, 0x0000000000000000, 0x00000001106B9C90) = 0
7.1523:        thread_tsleep(0, 0x09001000A02DC3F0, 0x0000000000000000, 0x0000000110658850) = 0
8.0978:        thread_tsleep(0, 0x09001000A02DC490, 0x0000000000000000, 0x0000000122801C90) = 0
8.1408:        thread_tsleep(0, 0x09001000A02DC400, 0x0000000000000000, 0x00000001106B9C90) = 0
8.1178:        thread_tsleep(0, 0x09001000A02DC4A0, 0x0000000000000000, 0x0000000122C54C90) = 0
8.2108:        thread_tsleep(0, 0x09001000A02DC400, 0x0000000000000000, 0x00000001106B9C90) = 0
...
anbob1:/> truss -d  -o ext.truss -p 7143634
Pstatus: process is not stopped
anbob1:/> grep lseek ext.truss
0.0186:        lseek(21, 4608, 0)               = 4608
0.1149:        lseek(22, 4608, 0)               = 4608
1.0197:        lseek(21, 4608, 0)               = 4608
1.1155:        lseek(22, 4608, 0)               = 4608
2.0207:        lseek(21, 4608, 0)               = 4608
2.1164:        lseek(22, 4608, 0)               = 4608
2.1980:        lseek(18, 0, 0)                  = 0
3.0218:        lseek(21, 4608, 0)               = 4608
3.1173:        lseek(22, 4608, 0)               = 4608
4.0228:        lseek(21, 4608, 0)               = 4608
4.1182:        lseek(22, 4608, 0)               = 4608
5.0320:        lseek(21, 4608, 0)               = 4608

Note:
lseek函数 Moves the read-write file pointer.
off_t lseek ( FileDescriptor, Offset, Whence)
当前的extract进程是在循环读,sleep 然后调用lseek 定位到4k+ 1block(512), 可以判断EXTRACT是跳过了4K 的保留区,我们确认当前数据库REDO 的VG配置。

anbob1:/home/oracle> dbfsize /dev/rzwa_redo01

Database file: /dev/rzwa_redo01
Database file type: raw device without 4K starting offset
Database file size: 9172992 512 byte blocks

anbob1:/home/oracle> offset /dev/rzwa_redo01
0

Note:
确认当前设备未配置4K的位移量。 问题又回来了,OGG 12里把RawDeviceOffset 的参数取消了,MOS 和 GOOLE中目前没有任何记录关于该参数的替代方法, 只能找原厂的人咨询, 结果发现该参数被隐藏了,现在OGG 12中的参数是_RawDeviceOffset(前带下划线). 下面修改参数重启测试 .

GGSCI (anbob1) 2> edit params E_ANBOB1
"/zwb_arch1/ogg/dirprm/e_anbob1.prm" 13 lines, 413 characters 
extract e_anbob1
------------------------------------------
-- local extract for anbob schema change
-- date:2016-8-24
-- zhangweizhao
-----------------------------------------
setenv(ORACLE_SID="acctb1")
USERID ggadmin, PASSWORD AACAAAAAAAAAAAPAUGFISAVCSBSAVFTHTCAAKINDXBQDIGQF,ENCRYPTKEY DEFAULT
TranlogOptions _RawDeviceOffset 0
discardFile dirrpt/e_anbob1.dsc, append
discardRollover at 02:00 on sunday
exttrail dirdat/l1
table DBMT.T_ANBOB;

GGSCI (anbob1) 4> stop ext E_ANBOB1
Sending STOP request to EXTRACT E_ANBOB1 ...
Request processed.

GGSCI (anbob1) 11> start ext E_ANBOB1
Sending START request to MANAGER ...
EXTRACT E_ANBOB1 starting


GGSCI (anbob1) 15> info ext E_ANBOB1
EXTRACT    E_ANBOB1  Last Started 2016-08-26 18:05   Status RUNNING
Checkpoint Lag       00:00:00 (updated 00:00:07 ago)
Process ID           14549378
Log Read Checkpoint  Oracle Redo Logs
                     2016-08-26 18:06:21  Thread 1, Seqno 119097, RBA 23774720
                     SCN 3479.853088956 (14943044311740)
Log Read Checkpoint  Oracle Redo Logs
                     2016-08-26 18:06:22  Thread 2, Seqno 137589, RBA 21712384
                     SCN 3479.853088969 (14943044311753)


GGSCI (anbob1) 18> STATS EXTRACT e_anbob1, TOTAL, HOURLY
Sending STATS request to EXTRACT E_ANBOB1 ...
Start of Statistics at 2016-08-26 18:07:13.
Output to dirdat/l1:
Extracting from GGADMIN.GG_HEARTBEAT_SEED to GGADMIN.GG_HEARTBEAT_SEED:

*** Total statistics since 2016-08-26 18:05:32 ***
        Total inserts                                      0.00
        Total updates                                    176.00
        Total deletes                                      0.00
        Total discards                                     0.00
        Total operations                                 176.00

*** Hourly statistics since 2016-08-26 18:05:32 ***
        Total inserts                                      0.00
        Total updates                                    176.00
        Total deletes                                      0.00
        Total discards                                     0.00
        Total operations                                 176.00

Extracting from DBMT.T_ANBOB to DBMT.T_ANBOB:
*** Total statistics since 2016-08-26 18:05:32 ***
        Total inserts                                     10.00
        Total updates                                      0.00
        Total deletes                                      0.00
        Total discards                                     0.00
        Total operations                                  10.00
*** Hourly statistics since 2016-08-26 18:05:32 ***
        Total inserts                                     10.00
        Total updates                                      0.00
        Total deletes                                      0.00
        Total discards                                     0.00
        Total operations                                  10.00

End of Statistics.

SQL> insert into dbmt.t_anbob select rownum,'anbob2' from dual connect by rownum<=40; 40 rows created. 
SQL> commit;
Commit complete.

GGSCI (anbob1) 19> STATS EXTRACT e_anbob1, TOTAL, HOURLY
Sending STATS request to EXTRACT E_ANBOB1 ...
Start of Statistics at 2016-08-26 18:08:59.
Output to dirdat/l1:
Extracting from GGADMIN.GG_HEARTBEAT_SEED to GGADMIN.GG_HEARTBEAT_SEED:
*** Total statistics since 2016-08-26 18:05:32 ***
        Total inserts                                      0.00
        Total updates                                    177.00
        Total deletes                                      0.00
        Total discards                                     0.00
        Total operations                                 177.00

*** Hourly statistics since 2016-08-26 18:05:32 ***
        Total inserts                                      0.00
        Total updates                                    177.00
        Total deletes                                      0.00
        Total discards                                     0.00
        Total operations                                 177.00

Extracting from DBMT.T_ANBOB to DBMT.T_ANBOB:

*** Total statistics since 2016-08-26 18:05:32 ***
        Total inserts                                     50.00
        Total updates                                      0.00
        Total deletes                                      0.00
        Total discards                                     0.00
        Total operations                                  50.00

*** Hourly statistics since 2016-08-26 18:05:32 ***
        Total inserts                                     50.00
        Total updates                                      0.00
        Total deletes                                      0.00
        Total discards                                     0.00
        Total operations                                  50.00

End of Statistics.

Note:
可以看到现在一切正常,并且心跳表(ogg12 新特性)的数据也已同步, 此时如果truss跟踪都已正常read.

0.1004:        lseek(21, 21707776, 0)           = 21707776
0.1316:        lseek(23, 23770624, 0)           = 23770624
0.2344:        lseek(23, 23771136, 0)           = 23771136
0.3371:        lseek(23, 23771136, 0)           = 23771136
1.1045:        lseek(21, 21708288, 0)           = 21708288
...

如果你和我一样好奇lseek中的第一个参数的文件句柄对应哪个文件?下面记录一下AIX 如果通过文件名柄确认文件名。

anbob1:/home/oracle> ps -ef|grep ogg
  oracle 58458336        1  12   Aug 25      - 270:46 /zwa_arch1/ogg/extract PARAMFILE /zwa_arch1/ogg/dirprm/eaccta1.prm REPORTFILE /zwa_arch1/ogg/dirrpt/EACCTA1.rpt PROCESSID EACCTA1 USESUBDIRS
...

anbob1:/home/oracle> truss -d -o /tmp/ext.tuss -p 58458336
Pstatus: process is not stopped

anbob1:/home/oracle> grep seek /tmp/ext.tuss
0.0416:        lseek(21, 479729152, 0)          = 479729152
0.0005:        lseek(23, 3925202432, 0)         = 3925202432
...

anbob1:/home/oracle> procfiles 58458336
58458336 : /zwa_arch1/ogg/extract PARAMFILE /zwa_arch1/ogg/dirprm/eaccta1.prm REPORTFILE /
  Current rlimit: 2147483647 file descriptors
   1: S_IFREG mode:0640 dev:46,2 ino:20792 uid:1024 gid:1023 rdev:0,0
      O_RDWR size:0 
   2: S_IFREG mode:0640 dev:46,2 ino:20792 uid:1024 gid:1023 rdev:0,0
      O_RDWR size:0 
   3: S_IFREG mode:0640 dev:46,2 ino:20792 uid:1024 gid:1023 rdev:0,0
      O_RDWR size:0 
   ...   
   21: S_IFCHR mode:00 dev:10,4 ino:32278 uid:1024 gid:1023 rdev:47,8
      O_RDONLY 
   22: S_IFREG mode:0640 dev:46,2 ino:20896 uid:1024 gid:1023 rdev:0,0
      O_RDWR size:0 
   23: S_IFCHR mode:00 dev:10,4 ino:32468 uid:1024 gid:1023 rdev:48,7
      O_RDONLY 
      
anbob1:/dev> ls -l|egrep "47, *8"
...
brw-rw----    1 oracle   oinstall     47, 88 Mar 20 2014  zwa_lv15_035
brw-rw----    1 oracle   oinstall     47, 89 Mar 20 2014  zwa_lv15_036
brw-rw----    1 oracle   oinstall     47,  8 Mar 20 2014  zwa_redo02
anbob1:/dev> ls -l|egrep "47, *7"
...
brw-rw----    1 oracle   oinstall     47, 79 Mar 20 2014  zwa_lv15_026
brw-rw----    1 oracle   oinstall     47,  7 Mar 20 2014  zwa_redo01

Summary:
因为数据库的RAW 卷组创建时使用了-TO不会跳过4K,DB而是从0位开始写,而OGG的AIX环境与其它平台不同默认是从4k后开始读,OGG 12前使用RawDeviceOffset参数配置,而在OGG 12中该参数为隐藏参数_RawDeviceOffset, 配置后重启OGG ext进程解决。

— 另外应该在下个月的USA OOW 2016后OGG 12.3新版本应该会Release.

打赏

对不起,这篇文章暂时关闭评论。