
Troubleshooting high “gc current grant 2-way” and “gc cr grant 2-way” Case Study (2)

A long time ago I ran into a similar case, also a RAC environment with very high "gc cr/current grant 2-way" wait events: "Performance tuning 'gc cr&current grant 2-way' event (after the host CPU expansion)". That case was caused by the two instances of the same cluster running on hosts with different CPU counts, which resulted in different numbers of LMS processes and therefore unbalanced serving capacity. The case shared this time is a different, rather special situation.


Both nodes showed sustained, relatively high GC-class waits:

NODE1

Top 5 Timed Foreground Events

Event Waits Time(s) Avg wait (ms) % DB time Wait Class
enq: TX - index contention 27,057 32,531 1202 25.09 Concurrency
DB CPU 26,320 20.30
enq: TX - row lock contention 165,277 9,237 56 7.12 Application
gc current grant 2-way 61,769 9,208 149 7.10 Cluster
log file sync 1,937,689 6,747 3 5.20 Commit

NODE2

Top 5 Timed Foreground Events

Event Waits Time(s) Avg wait (ms) % DB time Wait Class
gc buffer busy acquire 84,112 31,365 373 35.95 Cluster
gc current grant 2-way 24,906 11,217 450 12.86 Cluster
gc current block 2-way 71,984 8,190 114 9.39 Cluster
gc cr grant 2-way 34,748 6,061 174 6.95 Cluster
enq: TX - index contention 3,067 4,735 1544 5.43 Concurrency

Looking only at node 1, you would conclude this is concurrency contention (enq: TX - index contention / row lock contention), plus some gc cr/current grant 2-way. The grant 2-way events mean the block is not in any instance's cache and the requester is granted permission to read it from disk.

There are a few golden rules when diagnosing RAC problems:
1. Don't be fooled by the top events
In a RAC environment the root cause may be hiding on another node and may not even show up in that node's Top 5 events; the wait events it induces can be even higher than the root-cause wait itself. Instance performance is interdependent, and it is quite possible for a problem with one node's background process (such as LMS) to make all the other nodes show Application-class wait events.
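A quick way to avoid this trap is to look at the wait picture of all instances side by side. A minimal sketch against GV$SYSTEM_EVENT (cumulative since instance startup, so compare two samples or rely on AWR for a bounded interval) could look like:

-- Top non-idle wait events across all instances, so a root cause hiding
-- on the other node is not overlooked.
SELECT *
  FROM (SELECT inst_id,
               event,
               wait_class,
               total_waits,
               ROUND(time_waited_micro / 1e6, 1)                            time_waited_s,
               ROUND(time_waited_micro / 1000 / NULLIF(total_waits, 0), 1)  avg_ms
          FROM gv$system_event
         WHERE wait_class <> 'Idle'
         ORDER BY time_waited_micro DESC)
 WHERE ROWNUM <= 20;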

The gc buffer busy xx wait events are usually only the leading symptom on the way to the real root cause in RAC performance analysis. Simply put, a session is waiting for the BL lock on a block, and the time that all other processes spend waiting for that block's BL lock to become available is charged to the gc buffer busy xx wait events.

 

ASH and block dump trace files can be used to pin down GC BUFFER BUSY waits; the main goal is to identify the object, the object type and the block type (see the table below, and the ASH sketch that follows it).

Object type   Block type           Possible issue(s)
Table         Segment header       Freelists, freelist groups, deletes from one node, ASSM bugs, etc.
Table         Segment blocks       Heavy insert workload, heavy updates to a few blocks, SQL performance issues scanning a few objects aggressively
Index         Leaf block           Numerous inserts on indexed columns, sequence or sysdate generated keys
Index         Root / branch block  Inserts plus numerous index scans, concurrent inserts and deletes, etc.
Undo          Undo header block    Numerous short transactions
Undo          Undo block           Long pending transactions and CR block generation
Dictionary    SEQ$ (object)        Sequences with nocache or order set and excessive access to the sequence
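For example, a minimal ASH sketch along those lines (ASH requires the Diagnostics Pack license; the join to DBA_OBJECTS is only there to turn current_obj# into a name) could be:

-- Which objects and blocks are behind the gc buffer busy waits in recent ASH samples.
SELECT o.owner,
       o.object_name,
       o.object_type,
       a.current_file#  file#,
       a.current_block# block#,
       COUNT(*)         samples
  FROM gv$active_session_history a,
       dba_objects o
 WHERE a.event LIKE 'gc buffer busy%'
   AND o.object_id = a.current_obj#
 GROUP BY o.owner, o.object_name, o.object_type, a.current_file#, a.current_block#
 ORDER BY samples DESC;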

 

2. Rule out infrastructure problems

gc cr grant 2-way: if the block is not in any instance's buffer cache, the wait time is charged to this event; the foreground process sends a request to the remote LMS and the remote LMS replies 'read from disk'. High gc 2-way/3-way waits of this kind usually indicate congestion caused by heavy concurrency; the minimum transfer time for a block or message is normally < 2 ms (< 0.5 ms on Exadata). Check the Wait Event Histogram section of the AWR report for how the 2-way/3-way events are distributed across the time buckets, and rule out network problems.
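As a first sanity check on the infrastructure, it is worth confirming which interface each instance is really using for the private interconnect. A minimal sketch against GV$CLUSTER_INTERCONNECTS is:

-- Which NIC/IP each instance uses for cache fusion traffic.
-- IS_PUBLIC should be NO; SOURCE shows where the setting comes from.
SELECT inst_id, name, ip_address, is_public, source
  FROM gv$cluster_interconnects
 ORDER BY inst_id;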

3. Identify the problem instance

SELECT instance || '->' || inst_id transfer,
       class,
       cr_block cr_blk,
       trunc(cr_block_time / cr_block / 1000, 2) avg_cr,
       current_block cur_blk,
       trunc(current_block_time / current_block / 1000, 2) avg_cur
  FROM gv$instance_cache_transfer
 WHERE cr_block > 0
   AND current_block > 0
 ORDER BY instance, inst_id, class
/

4. Check the sender-side metrics, not just the receiver
On the sender (serving) side:
gc cr block receive time =
  time for the foreground to send a message to the remote LMS process
+ time taken by LMS to build the block (statistic: gc cr block build time)
+ LMS wait on LGWR latency (statistic: gc cr block flush time)
+ LMS send time (statistic: gc cr block send time)
+ wire latency
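These components can also be read outside AWR; a rough sketch against GV$SYSSTAT (statistic names as in 11.2; the time statistics are cumulative and, as in AWR, should be divided by the blocks-served counters to get per-block averages) could be:

-- Sender-side serving statistics per instance, cumulative since startup.
SELECT inst_id, name, value
  FROM gv$sysstat
 WHERE name IN ('gc cr blocks served',
                'gc cr block build time',
                'gc cr block flush time',
                'gc cr block send time',
                'gc current blocks served',
                'gc current block pin time',
                'gc current block flush time',
                'gc current block send time')
 ORDER BY inst_id, name;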

5. Use histograms to spot outliers, not just averages

See the Wait Event Histogram section of the AWR report (a sketch against GV$EVENT_HISTOGRAM follows).
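For an online view of the same distribution, a minimal sketch (values are cumulative since instance startup, so compare two samples or stick to the AWR histograms for a bounded interval) might be:

-- Latency bucket distribution of the 2-way gc events; a long tail in the
-- high-millisecond buckets points at LMS/CPU/interconnect trouble rather
-- than normal messaging overhead.
SELECT inst_id, event, wait_time_milli, wait_count
  FROM gv$event_histogram
 WHERE event IN ('gc cr grant 2-way',
                 'gc current grant 2-way',
                 'gc cr block 2-way',
                 'gc current block 2-way')
 ORDER BY inst_id, event, wait_time_milli;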

Case Analysis

Back to the case whose AAS chart showed the GC waits at the beginning: this is an 11.2.0.3 two-node RAC database on HP-UX 11.31.

# instance 1

Global Cache Load Profile

Per Second Per Transaction
Global Cache blocks received: 1,018.74 1.08
Global Cache blocks served: 78.53 0.08
GCS/GES messages received: 904.33 0.96
GCS/GES messages sent: 2,532.56 2.68
DBWR Fusion writes: 3.07 0.00
Estd Interconnect traffic (KB) 18,227.67

Global Cache Efficiency Percentages (Target local+remote 100%)

Buffer access – local cache %: 99.48
Buffer access – remote cache %: 0.39
Buffer access – disk %: 0.14

Global Cache and Enqueue Services – Workload Characteristics

Avg global enqueue get time (ms): 2.9
Avg global cache cr block receive time (ms): 6.8
Avg global cache current block receive time (ms): 3.8
Avg global cache cr block build time (ms): 0.0
Avg global cache cr block send time (ms): 0.0
Global cache log flushes for cr blocks served %: 0.5
Avg global cache cr block flush time (ms): 29.3
Avg global cache current block pin time (ms): 1.4
Avg global cache current block send time (ms): 0.0
Global cache log flushes for current blocks served %: 2.1
Avg global cache current block flush time (ms): 30.8

Global Cache and Enqueue Services – Messaging Statistics

Avg message sent queue time (ms): 250.4   <== problem instance
Avg message sent queue time on ksxp (ms): 6.8
Avg message received queue time (ms): 3.0
Avg GCS message process time (ms): 1.6
Avg GES message process time (ms): 0.0
% of direct sent messages: 43.52
% of indirect sent messages: 45.21
% of flow controlled messages: 11.27

 

TIPS:
The 'Avg message sent queue time' in AWR comes from (G)V$GES_STATISTICS. It measures the time needed to put an indirect message onto the message queue and send it (the GCS/GES communication layer). We also need to look at the KSXP message timing to determine whether there is a problem in the IPC layer; IPC time is normally < 1 ms, and you can query X$KSXPCLIENT for more detail. A high value can also simply mean the CPUs were extremely busy (100% CPU) or the interconnect was saturated during that period, which needs to be confirmed with OSWatcher data.
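The raw counters behind these AWR lines can be queried directly; a hedged sketch (filtering by name pattern because the exact statistic names vary slightly between releases) could be:

-- GES message queue / ksxp timing counters behind the AWR averages;
-- divide the time counters by the corresponding message counts.
SELECT inst_id, name, value
  FROM gv$ges_statistics
 WHERE LOWER(name) LIKE '%queue%'
    OR LOWER(name) LIKE '%ksxp%'
 ORDER BY inst_id, name;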

# instance 2

Global Cache Load Profile

Per Second Per Transaction
Global Cache blocks received: 96.24 0.75
Global Cache blocks served: 1,043.19 8.17
GCS/GES messages received: 3,607.20 28.26
GCS/GES messages sent: 1,342.88 10.52
DBWR Fusion writes: 3.09 0.02
Estd Interconnect traffic (KB) 19,197.66

Global Cache Efficiency Percentages (Target local+remote 100%)

Buffer access – local cache %: 99.43
Buffer access – remote cache %: 0.20
Buffer access – disk %: 0.37

Global Cache and Enqueue Services – Workload Characteristics

Avg global enqueue get time (ms): 3.5
Avg global cache cr block receive time (ms): 509.7
Avg global cache current block receive time (ms): 96.6
Avg global cache cr block build time (ms): 0.0
Avg global cache cr block send time (ms): 0.0
Global cache log flushes for cr blocks served %: 1.0
Avg global cache cr block flush time (ms): 1.3
Avg global cache current block pin time (ms): 0.2
Avg global cache current block send time (ms): 0.0
Global cache log flushes for current blocks served %: 0.1
Avg global cache current block flush time (ms): 1.7

Global Cache and Enqueue Services – Messaging Statistics

Avg message sent queue time (ms): 0.0
Avg message sent queue time on ksxp (ms): 142.6
Avg message received queue time (ms): 0.0
Avg GCS message process time (ms): 0.0
Avg GES message process time (ms): 0.0
% of direct sent messages: 73.94
% of indirect sent messages: 18.43
% of flow controlled messages: 7.63

Tips:

Avg global cache cr block receive time (ms): the time from when the requesting instance sends the message to the mastering instance until the block is received. It includes the time spent on the holding instance building the consistent-read image of the block. Fetching a CR block should not take more than about 15 ms.

Avg global cache current block receive time (ms): the time from when the requesting instance sends the message to the mastering instance (2-way get), and in some cases on to the holding instance (3-way get), until the block is received. It includes the log flush time on the holding instance. Fetching a current block should not take more than about 30 ms.

Avg message sent queue time on ksxp (ms): the time until the remote side's IPC layer acknowledges the message and returns an ACK. It is normally a very small number of milliseconds, directly reflects network latency, and is usually below 1 ms. The send time is measured when a block is handed to the IPC layer to be sent. The requests are queued in a completion queue, and the send time is measured from when a block is handed off to the IPC layer until the system call returns a completion, i.e. a status indicating that the send request has been successfully passed on to the network and control returns to the cache layer.

If this value is large, the cluster interconnect or the LMS processes may be in trouble.
Run an IPC diagnostic:
oradebug setmypid
oradebug ipc

1) If the network configuration looks fine, check whether the TCP and UDP buffers are overflowing:
root# netstat -s | grep overflow    -- this value should normally be 0
root# netstat -i
2) Check the LMS load: ps aux | grep lms

1. Check whether certain tables or segments are generating excessive global CR requests:

 select  STATISTIC_NAME stat,
            OWNER,
            OBJECT_NAME obj,
            sum(value) val
       from v$segment_statistics
      where STATISTIC_NAME like 'global%'
        and value > 10000
      group by  STATISTIC_NAME,OWNER,OBJECT_NAME
      order by val desc;

2. Check whether the CR request counts are very high over a period of time:

select CR_REQUESTS cr,
          CURRENT_REQUESTS cur,
          DATA_REQUESTS data,
         UNDO_REQUESTS undo,
          TX_REQUESTS tx
     from v$cr_block_server;
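V$CR_BLOCK_SERVER also exposes a LIGHT_WORKS column (how many times the light work rule was invoked, i.e. the LMS avoided doing the expensive CR fabrication work), which becomes relevant later in this case; a small sketch:

-- CR server workload per instance plus light work rule invocations.
SELECT inst_id,
       cr_requests,
       current_requests,
       data_requests,
       undo_requests,
       tx_requests,
       light_works
  FROM gv$cr_block_server
 ORDER BY inst_id;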

Confirm the problem node

Wait time                                             NODE1   NODE2
Avg global cache cr block receive time (ms)             6.8   509.7
Avg global cache current block receive time (ms)        3.8    96.6

Note:

In a RAC environment a slow node causes gc performance problems on the other nodes. Node 2's block receive times are huge, and those blocks are served by node 1, so by this logic node 1 is the problem node.

# Check the LMS processes' CPU usage at the time

# node1
oracle@qdty1:/home/oracle:11G> ps -efl|head -n 1;ps -efl|grep lms
  F S      UID   PID  PPID  C PRI NI             ADDR   SZ            WCHAN    STIME TTY       TIME COMD
1401 S   oracle  5117     1 237 178 20 e000001c934ba340 94816 e000000792a7f050  Jul 13  ?        15776:29 ora_lms1_tyjc1
1401 S     grid 23810     1  0 178 20 e000001c9de0e040 90895 e000000792a7f1d0  Jan 25  ?        1550:30 asm_lms0_+ASM1
1401 R   oracle  5119     1 254 178 20 e000001ce7c1b9c0 94816                -  Jul 13  ?        16214:17 ora_lms2_tyjc1
401 S   oracle  9366  7730  0 154 20 e000001cdb304c40   30 e000000da0da6472 10:50:38 pts/4     0:00 grep lms
1401 R   oracle  5115     1 252 178 20 e000001cf7042a00 94752                -  Jul 13  ?        16329:36 ora_lms0_tyjc1

# node 2
oracle@qdty2:/home/oracle:11G> ps -efl|head -n 1;ps -efl|grep lms|grep -v grep
  F S      UID   PID  PPID  C PRI NI             ADDR   SZ            WCHAN    STIME TTY       TIME COMD
1401 S     grid 12516     1  0 178 20 e000001b43146400 90854 e000000792d71348  Jan 25  ?        1461:04 asm_lms0_+ASM2
1401 S   oracle  6446     1 24 178 20 e000001ef09b0d00 95666 e000000792d710d0  Jul 13  ?        16447:18 ora_lms2_tyjc2
1401 S   oracle  6444     1 30 178 20 e000001ebe6c7d00 95666 e000000792d71248  Jul 13  ?        17583:22 ora_lms1_tyjc2
1401 S   oracle  6442     1 43 178 20 e000001e7bc97c80 95666 e000000792d71240  Jul 13  ?        17764:57 ora_lms0_tyjc2

Note:
This is an HP-UX 11.31 platform, so the C (CPU utilization) column shown by ps on node 1 can exceed 100, but it still shows that the LMS processes on node 1 were extremely busy on CPU. Node 2 looks normal.

# node1

Instance Activity Stats

Statistic Total per Second per Trans
data blocks consistent reads - undo records applied 3.7 TB 1.1 GB 1.1 MB
rollback changes - undo records applied 4,074,774 1,124.49 1.19
transaction tables consistent reads - undo records applied 100.1 MB 28.3 KB 0 B

 

# node 2

Instance Activity Stats

Statistic Total per Second per Trans
data blocks consistent reads - undo records applied 4.6 GB 2.7 MB 21.6 KB
rollback changes - undo records applied 1,043 0.59 0.00
transaction tables consistent reads - undo records applied 112.0 KB 0 B 0 B
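The gap between the two nodes in undo applied for CR generation can also be watched live; a minimal sketch against GV$SYSSTAT (statistic names as they appear in 11.2) is:

-- Undo application for CR generation per instance, cumulative since startup.
-- Node 1 applying orders of magnitude more undo than node 2 matches the
-- AWR numbers above.
SELECT inst_id, name, value
  FROM gv$sysstat
 WHERE name IN ('data blocks consistent reads - undo records applied',
                'rollback changes - undo records applied',
                'transaction tables consistent reads - undo records applied')
 ORDER BY inst_id, name;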

 

If it is neither an application SQL problem nor a network problem, why were the LMS processes burning so much CPU? Searching MOS, it is not hard to find that we had hit another bug: Bug 14672057.

The cause: starting with 11.2.0.2, the parameter _max_cr_rollbacks was introduced to fix the earlier bug 9569316. It defaults to 0 and represents the maximum number of CR block rollbacks; there is not much public documentation about it. When a lot of time is being spent applying undo to build CR blocks, setting it to a non-zero value enables the light work rule and helps relieve the pressure on the LMS processes. The bug only affects 11.2.0.2 and 11.2.0.3; because the workaround is simply to change "_max_cr_rollbacks" dynamically, no one-off patch was provided, and the bug is fixed in 11.2.0.4. After making the change below, the GC waits and the enq waits disappeared almost instantly and everything returned to normal, so the problem was not the index split contention suggested by node 1's top events.

SQL> alter system set "_max_cr_rollbacks"=200;
System altered.
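To verify the hidden parameter's value before and after the change, a commonly used sketch against the X$ fixed tables (must be run as SYS; column names as in 11.2) is:

-- Current value, default flag and description of "_max_cr_rollbacks".
SELECT i.ksppinm  parameter,
       v.ksppstvl current_value,
       v.ksppstdf is_default,
       i.ksppdesc description
  FROM x$ksppi i, x$ksppcv v
 WHERE i.indx = v.indx
   AND i.ksppinm = '_max_cr_rollbacks';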

# LMS processes on node1 after the change

oracle@qdty1:/home/oracle:11G> ps -efl|head -n 1;ps -efl|grep lms|grep -v grep
  F S      UID   PID  PPID  C PRI NI             ADDR   SZ            WCHAN    STIME TTY       TIME COMD
1401 S   oracle  5117     1 55 178 20 e000001c934ba340 94816 e000000792a7f0d0  Jul 13  ?        15778:04 ora_lms1_tyjc1
1401 S     grid 23810     1  0 178 20 e000001c9de0e040 90895 e000000792a7f350  Jan 25  ?        1550:30 asm_lms0_+ASM1
1401 S   oracle  5119     1 76 178 20 e000001ce7c1b9c0 94816 e000000792a7f1c8  Jul 13  ?        16216:40 ora_lms2_tyjc1
1401 S   oracle  5115     1 80 178 20 e000001cf7042a00 94752 e000000792a7f3c0  Jul 13  ?        16331:29 ora_lms0_tyjc1

