首页 » ORACLE » Logon Storm and Memory shortage issue ORA-07445 dbgtrReserveSpace (记录一次登录风暴致内存不足的问题)

Logon Storm and Memory shortage issue ORA-07445 dbgtrReserveSpace (记录一次登录风暴致内存不足的问题)

记录一次内存不足使用SWAP 严重时的数据库表现。 oracle 11203 on linux 64
cpu core 8,物理内存:8g,问题当时物理内存使用完,swap 已使用4G,有配置hugepage,且 vmstat 查看swap in out 严重。负载60左右

Check oracle alert

2013-10-19 08:51:10.384000 +08:00
Exception [type: SIGSEGV, Address not mapped to object] [ADDR:0x10] [PC:0x938E8C5, dbgtrReserveSpace()+99] [flags: 0x0, count: 1]
Errors in file /oracle/diag/rdbms/ora11g/ora11g/trace/ora11g_ora_13864.trc  (incident=60801):
ORA-07445: exception encountered: core dump [dbgtrReserveSpace()+99] [SIGSEGV] [ADDR:0x10] [PC:0x938E8C5] [Address not mapped to object] []
Incident details in: /oracle/diag/rdbms/ora11g/ora11g/incident/incdir_60801/ora11g_ora_13864_i60801.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
2013-10-19 08:52:14.986000 +08:00
Sweep [inc][60837]: completed
Sweep [inc][60836]: completed
Sweep [inc][60835]: completed
Sweep [inc][60834]: completed
Sweep [inc][60833]: completed
...
Sweep [inc][60839]: completed
Sweep [inc][60838]: completed
2013-10-19 08:53:24.174000 +08:00
Errors in file /oracle/diag/rdbms/ora11g/ora11g/trace/ora11g_ora_14250.trc  (incident=47873):
ORA-00600: internal error code, arguments: [17147], [0x15A2005D0], [], [], [], [], [], [], [], [], [], []
Incident details in: /oracle/diag/rdbms/ora11g/ora11g/incident/incdir_47873/ora11g_ora_14250_i47873.trc
...
2013-10-19 09:10:01.732000 +08:00
WARNING: Heavy swapping observed on system in last 5 mins.
2013-10-19 09:10:13.915000 +08:00
pct of memory swapped in [8.73%] pct of memory swapped out [4.84%].
Please make sure there is no memory pressure and the SGA and PGA
are configured correctly. Look at DBRM trace file for more details.

Incident details

vi /oracle/diag/rdbms/ora11g/ora11g/incident/incdir_60801/ora11g_ora_13864_i60801.trc

Dump continued from file: /oracle/diag/rdbms/ora11g/ora11g/trace/ora11g_ora_13864.trc
ORA-07445: exception encountered: core dump [dbgtrReserveSpace()+99] [SIGSEGV] [ADDR:0x10] [PC:0x938E8C5] [Address not mapped to object] []
========= Dump for incident 60801 (ORA 7445 [dbgtrReserveSpace()+99]) ========
----- Beginning of Customized Incident Dump(s) -----
Exception [type: SIGSEGV, Address not mapped to object] [ADDR:0x10] [PC:0x938E8C5, dbgtrReserveSpace()+99] [flags: 0x0, count: 1]
Registers:
%rax: 0x00000000770000e4 %rbx: 0x0000000000000068 %rcx: 0x0000000000000068
%rdx: 0x00000001658f10c0 %rdi: 0x00002b25a44ed710 %rsi: 0x00007fff08087ae8
%rsp: 0x00007fff08087930 %rbp: 0x00007fff080879b0  %r8: 0x0000000000000068
 %r9: 0x0000000000000001 %r10: 0x0000000000000068 %r11: 0x0000000000000000
%r12: 0x0000000000000000 %r13: 0x00007fff08087ae8 %r14: 0x00000001658f10c0
%r15: 0x0000000000000001 %rip: 0x000000000938e8c5 %efl: 0x0000000000010206
  dbgtrReserveSpace()+83 (0x938e8b5) jnz 0x938f18c
  dbgtrReserveSpace()+89 (0x938e8bb) mov (%r14),%r12
  dbgtrReserveSpace()+92 (0x938e8be) or $64,%eax
  dbgtrReserveSpace()+95 (0x938e8c1) mov %eax,0x8(%r14)
> dbgtrReserveSpace()+99 (0x938e8c5) mov 0x10(%r12),%r8
  dbgtrReserveSpace()+104 (0x938e8ca) mov 0x18(%r12),%r9
  dbgtrReserveSpace()+109 (0x938e8cf) mov %r9,%rax
  dbgtrReserveSpace()+112 (0x938e8d2) sub %r8,%rax
  dbgtrReserveSpace()+115 (0x938e8d5) movzwl %ax,%eax
  ----- Call Stack Trace -----
skdstdst()+36        call     kgdsdst()            
ksedst1()+98         call     skdstdst()           
ksedst()+34          call     ksedst1()            
dbkedDefDump()+2741  call     ksedst()             
ksedmp()+36          call     dbkedDefDump()       
ssexhd()+2458        call     ksedmp()             
__sighandler()       call     ssexhd()             
dbgtrReserveSpace()  signal   __sighandler()       
+99                                                
dbgtTrcData_int()+3  call     dbgtrReserveSpace()  
18                                                 
kstipg()+740         call     dbgtTrcData_int()    
dbktbDefaultBucketA  call     kstipg()             
lloc()+87                                          
ksucrp()+1470        call     dbktbDefaultBucketA  
                              lloc()               
opiino()+1477        call     ksucrp()             
opiodr()+916         call     opiino()             
opidrv()+565         call     opiodr()             
sou2o()+98           call     opidrv()             
opimai_real()+128    call     sou2o()              
ssthrdmain()+252     call     opimai_real()        
main()+196

Analyze listener.ora log

[oracle@dbserver89 trace]$ fgrep "19-OCT-2013 08:" listener.log_bak|fgrep "establish"|awk '{print $1 "" $2}'| awk -F: '{print $1 ":" $2}' |sort|uniq -c
...
114 19-OCT-201308:40
     83 19-OCT-201308:41
     57 19-OCT-201308:42
     13 19-OCT-201308:43
     84 19-OCT-201308:44
     93 19-OCT-201308:45
     95 19-OCT-201308:46
     69 19-OCT-201308:47
     71 19-OCT-201308:48
     65 19-OCT-201308:49
     52 19-OCT-201308:50
    127 19-OCT-201308:51
     39 19-OCT-201308:52
     99 19-OCT-201308:53
     39 19-OCT-201308:54
    338 19-OCT-201308:55
    785 19-OCT-201308:56
   1229 19-OCT-201308:57
   1122 19-OCT-201308:58
   1190 19-OCT-201308:59
   ...
[root@dbserver89 sa]# sar -f sa19  -s 08:00:00 -e 10:00:00
Linux 2.6.18-92.el5 (dbserver89)        10/19/2013

08:00:01 AM       CPU     %user     %nice   %system   %iowait    %steal     %idle
08:10:01 AM       all      0.13      0.00      0.09      0.11      0.00     99.67
08:20:02 AM       all      0.17      0.00      0.11      0.15      0.00     99.58
08:30:01 AM       all      0.12      0.00      0.09      0.25      0.00     99.54
08:40:01 AM       all      0.25      0.00      0.13      0.54      0.00     99.09
08:50:01 AM       all      0.54      0.00      0.17      0.84      0.00     98.45
09:00:01 AM       all      0.84      0.00      0.40      0.66      0.00     98.11
09:10:53 AM       all      0.80      0.00      8.02     42.91      0.00     48.27
09:20:05 AM       all      0.50      0.00      8.24     32.31      0.00     58.95
09:30:06 AM       all      0.69      0.00      5.29     27.21      0.00     66.81
09:40:02 AM       all      0.59      0.00      7.75     27.87      0.00     63.79
09:50:01 AM       all      0.33      0.00      0.31      1.40      0.00     97.96
Average:          all      0.45      0.00      2.78     12.28      0.00     84.49

Logon Storm

A Logon storm is a sudden spike in the number of client connection requests. Logon storms can occur due to a variety of factors. They could be malicious like a DoS attack. Or they could occur due to administrative actions – such as a middle tier coming online.
The process of starting a database session has inherent CPU costs associated with it – from the creation of a new OS process to attaching to the SGA. Every system has a limit above which the spawning of new database sessions can starve existing sessions of CPU resources. This could result in many unexpected consequences including application timeouts, increased response times, un-responsiveness and other cascading effects.

同时oracle 11g 默认是启用了 db audit 审计。
如果查询那个时间段的事件可以用下面的方法

  SELECT snap_id,
         sample_time,
         session_type,
         session_id,sql_opname,
         sql_id,
         event,
         p1,
         p2,
         blocking_session,
         blocking_hangchain_info,
         current_obj#,
         machine
    FROM DBA_HIST_ACTIVE_SESS_HISTORY
   WHERE sample_time BETWEEN TO_DATE ('2013-10-19 08:00', 'yyyy-mm-dd hh24:mi')
                         AND TO_DATE ('2013-10-19 10:00', 'yyyy-mm-dd hh24:mi')
ORDER BY 1,

会发现很多buffer busy wait 对象是aud$. 同时伴有resmgr:cpu quantum

打赏

对不起,这篇文章暂时关闭评论。