首页 » ORACLE 9i-23c » Lots of Long transaction caused by database link, and undo hdr show DBA for that slot is 0x00000000

Lots of Long transaction caused by database link, and undo hdr show DBA for that slot is 0x00000000

部署GOLDENGATE时发现,当前库中存在较多的长事务,在v$transaction中显示状态一直是ACTIVE, 对于长事务的对OGG的BR或启动抽取位置有较大影响, 奇怪的是这些长事务的起动时间甚至都有3天以上,而且当前会话状态已是INACITVE.而且查看UNDO SEGMENT HEADER上对应的SLOT 的DBA是0x00000000。

ANBOB@SQL> SELECT xidusn,
  2         xidslot,
  3         xidsqn,
  4         ubafil,
  5         ubablk,
  6         ubasqn,
  7         ubarec,
  8         t.status,
  9         start_time,
 10         start_ubafil,
 11         start_ubablk,
 12         start_ubasqn,
 13         start_ubarec,
 14         ses_addr,
 15         used_ublk,
 16         used_urec,
 17         s.sql_id,sid
 18    FROM v$transaction t,v$session s 
 19   WHERE addr = '0700000A0D6B1308' and T.addr=S.TADDR
 20  ;

    XIDUSN    XIDSLOT     XIDSQN     UBAFIL     UBABLK     UBASQN     UBAREC STATUS           
---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------------- 
      2880         51    2938814          0          0          0          0 ACTIVE           
	  
START_TIME           START_UBAFIL START_UBABLK START_UBASQN START_UBAREC SES_ADDR         
-------------------- ------------ ------------ ------------ ------------ ---------------- 
08/23/16 05:38:03             663       410489        23081           50 0700000A0199F210 

 USED_UBLK  USED_UREC SQL_ID               SID
---------- ---------- ------------- ----------	  
         1          1                     3331	  
		 
ANBOB@SQL> SELECT
  2        ktuxeusn            usn#     -- 65535 = no-undo transaction
  3      , ktuxeslt            slot#    -- 65535 = invalid slot#
  4      , ktuxesqn            seq#
  5      , ktuxesta            status
  6      , ktuxecfl            flags
  7      , ktuxesiz            undo_blks
  8      , ktuxerdbf           curfile 
  9      , ktuxerdbb           curblock
 10      , ktuxescnw * power(2, 32) + ktuxescnb cscn -- commit/prepare commit SCN
 11      , ktuxeuel            
 12      -- distributed xacts
 13      --, ktuxeddbf           r_rfile
 14      --, ktuxeddbb           r_rblock
 15      , ktuxepusn           r_usn#
 16      , ktuxepslt           r_slot#
 17      , ktuxepsqn           r_seq#
 18    FROM
 19        x$ktuxe
 20    WHERE ktuxesta != 'INACTIVE' and ktuxeusn=2880
 21    ORDER BY
 22        ktuxeusn
 23      , ktuxeslt
 24    /

      USN#      SLOT#       SEQ# STATUS           FLAGS      UNDO_BLKS    CURFILE   CURBLOCK       CSCN   KTUXEUEL     R_USN#    R_SLOT#     R_SEQ#
---------- ---------- ---------- ---------------- --------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ----------
      2880         51    2938814 ACTIVE           NONE               0          0          0 1.4942E+13          2          0          0          0
	  
 

NOTE:
UBAFIL,UBABLK,UBAREC 都为0, USED_UBLK为1, 而且该事务一直是ACTIVE 状态, start_time是三天前的时间。 SQL ID为空,通常我们是找出这个事务相关的对象和事务类型, 下面dump undo segment header.

ANBOB@SQL>  select segment_name from dba_rollback_segs where segment_id=2880;

SEGMENT_NAME
------------------------------
_SYSSMU2880_2077687819$

ANBOB@SQL> ALTER SYSTEM DUMP UNDO HEADER '_SYSSMU2880_2077687819$';

System altered.


  index  state cflags  wrap#    uel         scn            dba            parent-xid    nub     stmt_num    cmt
  ------------------------------------------------------------------------------------------------
   0x00    9    0x00  0x2d1821  0x0008  0x0d97.0fb42008  0xa5c64379  0x0000.000.00000000  0x00000001   0x00000000  1471901792
   0x01    9    0x00  0x2d1820  0x0032  0x0d97.0fb4d3db  0xa5c64379  0x0000.000.00000000  0x00000001   0x00000000  1471901798
   0x02    9    0x00  0x2d16df  0x002c  0x0d97.0f8e5c10  0xa5c64379  0x0000.000.00000000  0x00000001   0x00000000  1471901528
   0x03    9    0x00  0x2d181e  0x003c  0x0d97.0faafce2  0xa5c64379  0x0000.000.00000000  0x00000001   0x00000000  1471901697
   0x04    9    0x00  0x2d17ed  0x0037  0x0d97.0fb26ea1  0xa5c64379  0x0000.000.00000000  0x00000001   0x00000000  1471901771
   0x05    9    0x00  0x2d17bc  0x0027  0x0d97.0fa19bc3  0xa5c64379  0x0000.000.00000000  0x00000001   0x00000000  1471901638
   0x06    9    0x00  0x2d063b  0xffff  0x0d99.305693fb  0x00000000  0x0000.000.00000000  0x00000000   0x00000000  1472687730
...
   0x2e    9    0x00  0x2d1823  0x001b  0x0d97.0fbb6dd3  0xa5c64379  0x0000.000.00000000  0x00000001   0x00000000  1471901861
   0x2f    9    0x00  0x2d1832  0x0038  0x0d97.b134f1bc  0x00000000  0x0000.000.00000000  0x00000000   0x00000000  1472122507
   0x30    9    0x00  0x2d1811  0x003f  0x0d98.75d8e475  0x00000000  0x0000.000.00000000  0x00000000   0x00000000  1472424937
   0x31    9    0x00  0x2d1550  0x003e  0x0d97.0fbddba4  0xa5c64379  0x0000.000.00000000  0x00000001   0x00000000  1471901882
   0x32    9    0x00  0x2d181f  0x0044  0x0d97.0fb5ca7a  0xa5c64379  0x0000.000.00000000  0x00000001   0x00000000  1471901808
   0x33   10    0x80  0x2cd7be  0x0002  0x0d97.0fbdee68  0x00000000  0x0000.000.00000000  0x00000000   0x00000000  0
   0x34    9    0x00  0x2d182d  0x0013  0x0d97.0fafa64e  0xa5c64379  0x0000.000.00000000  0x00000001   0x00000000  1471901737
   0x35    9    0x00  0x2d184c  0x003b  0x0d97.0fade00c  0xa5c64379  0x0000.000.00000000  0x00000001   0x00000000  1471901720



ANBOB@SQL> SELECT * FROM x$ktuxe WHERE  ktuxeusn=2880 AND INDX=51;

ADDR                   INDX    INST_ID   KTUXEUSN   KTUXESLT   KTUXESQN  KTUXERDBF  KTUXERDBB  KTUXESCNB  KTUXESCNW KTUXESTA        
---------------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ----------------
00000001109CDD30         51          1       2880         51    2938814          0          0  264105576       3479 ACTIVE          

 KTUXECFL                   KTUXEUEL  KTUXEDDBF  KTUXEDDBB  KTUXEPUSN  KTUXEPSLT  KTUXEPSQN   KTUXESIZ
 ------------------------ ---------- ---------- ---------- ---------- ---------- ---------- ----------
 NONE                              2          0          0          0          0          0          0 

NOTE:
注意到UNDO段头上记录的记录确为ACTIVE状态,但是找事务的最后一个UNDO block DBA时发现是0, 而且cmt 也为0, 关于undo seg的意义不在本篇笔记的范围后续会写,
结束没法找,下面尝试dump 事务开始时的undo block.

 ##### dump strat undo block
 
 ANBOB@SQL> alter system dump datafile 663 block 410489;
System altered.

ANBOB@SQL> oradebug setmypid
Statement processed.
 
ANBOB@SQL> oradebug tracefile_name
/oracle/app/oracle/diag/rdbms/acctd/acctd1/trace/acctd1_ora_40173614.trc
 
********************************************************************************
UNDO BLK:
xid: 0x0b40.033.002cd7be  seq: 0x5a29 cnt: 0x32  irb: 0x32  icl: 0x0   flg: 0x0000      <<<<<<<<<<# USN#  SLOT#   SEQ#   START_UBASQN

 Rec Offset      Rec Offset      Rec Offset      Rec Offset      Rec Offset
---------------------------------------------------------------------------
0x01 0x3f60     0x02 0x3ed8     0x03 0x3e00     0x04 0x3d48     0x05 0x3c70
0x06 0x3be8     0x07 0x3b1c     0x08 0x3a94     0x09 0x39bc     0x0a 0x3934
0x0b 0x3870     0x0c 0x37c0     0x0d 0x3738     0x0e 0x36b0     0x0f 0x35d8
0x10 0x3520     0x11 0x3470     0x12 0x33c0     0x13 0x3310     0x14 0x324c
0x15 0x3194     0x16 0x30d0     0x17 0x3020     0x18 0x2f98     0x19 0x2f10
0x1a 0x2e38     0x1b 0x2db0     0x1c 0x2d00     0x1d 0x2c4c     0x1e 0x2b74
0x1f 0x2aec     0x20 0x2a64     0x21 0x29dc     0x22 0x2954     0x23 0x28cc
0x24 0x27f4     0x25 0x276c     0x26 0x26e4     0x27 0x265c     0x28 0x25ac
0x29 0x2524     0x2a 0x2460     0x2b 0x23d8     0x2c 0x2320     0x2d 0x224c
0x2e 0x21c4     0x2f 0x213c     0x30 0x2088     0x31 0x2000     0x32 0x1fb0           <<<<<<# start_rec


*-----------------------------
* Rec #0x31  slt: 0x31  objn: 2171677(0x0021231d)  objd: 2931186  tblspc: 27(0x0000001b)
*       Layer:  11 (Row)   opc: 1   rci 0x00
Undo type:  Regular undo    Begin trans    Last buffer split:  No
Temp Object:  No
Tablespace Undo:  No
rdba: 0x00000000Ext idx: 0
flg2: 0
*-----------------------------
uba: 0xa5c64379.5a29.30 ctl max scn: 0x0d97.0f6f6c77 prv tx scn: 0x0d97.0f705634
txn start scn: scn: 0x0d97.0fbd52aa logon user: 389
 prev brb: 2781234038 prev bcl: 0
KDO undo record:
KTB Redo
op: 0x04  ver: 0x01
compat bit: 4 (post-11) padding: 1
op: L  itl: xid:  0x0b38.00f.00260322 uba: 0xa5c63636.5afb.4f
                      flg: C---    lkc:  0     scn: 0x0d97.0fbd5270
KDO Op code: LKR row dependencies Disabled
  xtype: XA flags: 0x00000000  bdba: 0x704ab721  hdba: 0x704ab70a
itli: 12  ispac: 0  maxfr: 9774
tabn: 0 slot: 14 to: 0

# find 0x32 
*-----------------------------
* Rec #0x32  slt: 0x33  objn: 0(0x00000000)  objd: 0  tblspc: 0(0x00000000) <<<<
*       Layer:   5 (Transaction Undo)   opc: 7   rci 0x00
Undo type:  Regular undo    Begin trans    Last buffer split:  No
Temp Object:  No
Tablespace Undo:  No
rdba: 0x00000000Ext idx: 0
flg2: 0
*-----------------------------
uba: 0xa5c64379.5a29.31 ctl max scn: 0x0d97.0f705634 prv tx scn: 0x0d97.0f720e6f
txn start scn: scn: 0x0d97.0fbdee68 logon user: 389
 prev brb: 2781234038 prev bcl: 0

2016-09-02 16:49:19.961415 : kjbmbassert [0x64379.297]
2016-09-02 16:49:19.961557 : kjbmsassert(0x64379.297)(2)
End dump data blocks tsn: 1 file#: 663 minblk 410489 maxblk 410489

NOTE:
从该事务开始的undo block中undo 记录0x32(50)中发现只是一个transaction undo 类型objn 0 没有相关的对象,也没有DML的UNDO记录, 继续查看活动事务对应的session对锁的持有情况.

 ANBOB@SQL> select * from v$lock where sid=3331;

ADDR             KADDR                   SID TY        ID1        ID2      LMODE    REQUEST      CTIME      BLOCK
---------------- ---------------- ---------- -- ---------- ---------- ---------- ---------- ---------- ----------
0700000A21ED9FD8 0700000A21EDA030       3331 AE    2498633          0          4          0    1143242          2
0700000A0D6B1308 0700000A0D6B1380       3331 TX  188743731    2938814          6          0    1142147          2

ANBOB@SQL> select b.username,a.sid, trunc(a.id1/power(2,16)) rbs,bitand(a.id1,to_number('ffff','xxxx'))+0 slot,
a.id2 seq,a.lmode,a.request, a.block 
from v$lock a,v$session b 
where a.type = 'TX' and a.sid = b.sid and b.sid=3331;

USERNAME                              SID        RBS       SLOT        SEQ      LMODE    REQUEST      BLOCK
------------------------------ ---------- ---------- ---------- ---------- ---------- ---------- ----------
ACCOUNT                              3331       2880         51    2938814          6          0          2

Note:
该会话持有一个TX LMODE 6的锁,block=2 全局锁。当前库无分布式事务锁。

ANBOB@SQL> select * from dba_2pc_pending;
no rows selected

其实这种现象是有分布式事务引起的如通过DBLINK的SQL,即使只是一个通过DATABASE LINK的的一个查询,数据库也会在本地使用一个UNDO block来标志这个分布式事务,并且持有一个TX lock, 因为本地节点不能知道远程节点的UNDO情况,所有undo段头上DBA为0 .

解决方法有两种
1,commit, rollback, logout 即使是一个select 语句通过DBLINK,这里不再演示。
2,如果你能确认通过DBLINK只是做查询,那样可以在事务开始set transaction read only; 然后再通过dblink 查询就不会产生这种长事务,下面演示这种。

SYS@SQL> set transaction read only;
Transaction set.

SYS@SQL> select sysdate from dual@REPORT;
SYSDATE
-----------------
20160911 18:47:29

ANBOB@SQL> SELECT t.xidusn "RBS Name",
              t.used_ublk "Used Blocks",
              s.username "User Name"
                     ,t.start_time
       FROM v$transaction t, 
            v$session s
       WHERE t.ses_addr=s.saddr 
      and s.sid=2496
       /

  RBS Name Used Blocks User Name                      START_TIME
---------- ----------- ------------------------------ --------------------
      4330           1 SYS                            09/11/16 18:47:23

NOTE:
oops!, 可以看到还是分配1个undo block, 开始我以为是版本有问题,测了几个版本依然存在,终于还是从官方文档找到了原因。

This clause is not supported for the user SYS. Queries by SYS will return changes made during the transaction even if SYS has set the transaction to be READ ONLY.

ANBOB@SQL> conn system
Enter password:
Connected.
USERNAME INST_NAME HOST_NAME SID SERIAL# VERSION STARTED SPID OPID CPID SADDR PADDR
-------------------- -------------------- ------------------------- ----- -------- ---------- -------- --------------- ----- --------------- ---------------- ----------------
SYSTEM anbob1 weejar1 2581 44169 11.2.0.3.0 20160130 30540308 91 30474778 0700000A018577E0 0700000A191A7728

SYSTEM@SQL> select sysdate from dual@REPORT;
SYSDATE
-----------------
20160913 10:26:46

ANBOB@SQL> SELECT t.xidusn "RBS Name",
2 t.used_ublk "Used Blocks",
3 s.username "User Name"
4 ,t.start_time
5 FROM v$transaction t,
6 v$session s
7 WHERE t.ses_addr=s.saddr
8 and s.sid=2581;

RBS Name Used Blocks User Name START_TIME
---------- ----------- ------------------------------ --------------------
2975 1 SYSTEM 09/13/16 10:26:40

SYSTEM@SQL> commit;
Commit complete.

SYSTEM@SQL> set transaction read only;
Transaction set.

SYSTEM@SQL> select sysdate from dual@REPORT;
SYSDATE
-----------------
20160913 10:28:27

SYSTEM@SQL> SELECT t.xidusn "RBS Name",
2 t.used_ublk "Used Blocks",
3 s.username "User Name"
4 ,t.start_time
5 FROM v$transaction t,
6 v$session s
7 WHERE t.ses_addr=s.saddr
8 and s.sid=2581;
no rows selected

关于DBLINK的使用脚本可以看这里Script: Who’s using a database link?(找出谁在使用dblink)

Summary:
分布式事务即使是通过DBLINK的select也会分配一个undo block,并持有tx lock的事务,如果该会话最后停留这样的事务,就会导致长事务一直不释放,从而影响OGG, 对于这样的事务要及时COMMIT或rollback, 或在对于只查询的DBLINK事务开始前使用set transaction read only; 这个配置不影响SYS用户。

打赏

对不起,这篇文章暂时关闭评论。