首页 » ORACLE 9i-23c » 再续PGA 和max_map_count、_realfree_heap_pagesize

再续PGA 和max_map_count、_realfree_heap_pagesize

之前分享过《PGA 一点认识》《Know more about PGA_AGGREGATE_LIMIT 12c 19c》 看到限制PGA的参数有_PGA_MAX_SIZE、 _SMM_(PX)_MAX_SIZE、PGA_AGGREGATE_TARGET、PGA_AGGREGATE_LIMIT ,为了提高SQL性能手动增加_PGA_MAX_SIZE参数值(The hidden parameter _pga_max_size does not limit a process size, only the work area),如在数据仓库环境可能更明显,最近看到一个案例在执行存储过程时,当把参数_PGA_MAX_SIZE 调到4GB以上时(default 2GB),依旧遇到了ora-4030 错误, trace 文件中显示上限达4GB的上限,Process Map Dump 达65537行。

在MOS中查找不难发现PLSQL Procedure Causing ORA-04030: (pga heap,control file i/o buffer) And ORA-04030: (koh-kghu sessi,pmuccst: adt/record) or ORA-04030: (koh-kghucall ,pmucalm coll) Errors (Doc ID 1325100.1) 匹配,提到了解决方案有2种
1,增加操作系统级page map的个数

more /proc/sys/vm/max_map_count
sysctl -w vm.max_map_count=262144 (for example)   -- add /etc/sysctl.conf
-- or --

2, 增加数据库级realfree heap pagesize大小,需要重启数据库
For versions 11.2.0.4 and lower:

_use_realfree_heap=TRUE
_realfree_heap_pagesize_hint = 262144

For 12.1 and higher:

_use_realfree_heap=TRUE
_realfree_heap_pagesize = 262144

检查默认配置

[oracle@oel7db1 ~]$ sysctl -a|grep max_map
vm.max_map_count = 65530

SQL> @pd realfree
Show all parameters and session values from x$ksppi/x$ksppcv...

       NUM N_HEX NAME                                                     VALUE                          DESCRIPTION
---------- ----- -------------------------------------------------------- ------------------------------ ---------------------------------------------------------------------------------------------------
       192    C0 _realfree_heap_max_size                                  32768                          minimum max total heap size, in Kbytes
       193    C1 _realfree_heap_pagesize                                  65536                          hint for real-free page size in bytes
       194    C2 _realfree_pq_heap_pagesize                               65536                          hint for pq real-free page size in bytes
       195    C3 _realfree_heap_mode                                      0                              mode flags for real-free heap
       196    C4 _use_realfree_heap                                       TRUE                           use real-free based allocator for PGA memory

[oracle@oel7db1 ~]$ echo 65536*65530/1024/1024|bc
4095

默认数据库realfree allocator pagesize 是 64 kB (65536), 操作系统vm.max_map_count默认是65530, 64*65530=4GB. 如果把realfree heap pagesize增加到256kB (262144) 上限就成了16GB.对于多租户修改CDB级参数即可,PDB继承。

解决问题一方面增加单个进程PGA上限,另一方面要检查为什么单个进程PGA使用这么高?是否是合理?当发现有些进程使用PGA或内存使用较高时,如11g, 12, 19c都存在的bug ASM的dia0/diag 进程内存较大时,检查方法:

column name format a25
column pname format a12
column "MegaBytes" format a10
set numwidth 6

select ss.sid, p.pid, p.pname, sn.name, round(ss.value/(1024 *1024))||'Mb' "MegaBytes"
from v$sesstat ss, v$statname sn, v$session s, v$process p
where s.paddr = p.addr
and sn.statistic# = ss.statistic#
and s.sid = ss.sid
and sn.name in ('session pga memory' , 'session pga memory max')
and p.pname='DIA0'
order by ss.value
/

select s.sid, s.program, p.pga_used_mem, p.pga_alloc_mem 
               from v$session s, v$process p
              where s.paddr=p.addr and p.pga_used_mem >xxx;

oradebug setorapid xxx
oradebug unlimit
oradebug dump heapdump 1
oradebug tracefile_name

-- format trace file using heap.awk

查找内存使用相关的SQL和heap名称,查询是否有bug? 如上面提到的bug  diag/dia进程可以kill 自动重启进程释放内存。

在Linux系统中可以使用pmap查看Process Map 记录数,但是当我在RAC on  RHEL 7环境使用oracle用户用pmap查看server process(LOCAL=NO),提示无权限。如果使用strace跟踪是在read  /proc/PID/maps文件时. 查看文件系统权限时:

$ ls -l /proc/NNNN/
-r--r--r-- 1 root root 0 Aug  9 16:06 maps
-rw------- 1 root root 0 Aug  9 16:06 mem
...
-r--r--r-- 1 root root 0 Aug  9 16:06 smaps
-r--r--r-- 1 root root 0 Aug  9 16:06 stack
-r--r--r-- 1 root root 0 Aug  9 16:06 stat
-r--r--r-- 1 root root 0 Aug  9 16:06 statm
-r--r--r-- 1 root root 0 Aug  9 16:06 status

注意是-r-r-r的权限,单从文件权限看other是有read权限的,使用oracle用户cat maps文件无权限,如果尝cat status文件是可以,后来从官方《Permission denied when reading /proc/$PID/* owned by the user》 和stackoverflow 找到了答案。

It’s because the file permissions are not the only protection you’re encountering.

Those aren’t just regular text files on a file system, procfs is a window into process internals and you have to get past both the file permissions plus whatever other protections are in place.

The maps show potentially dangerous information about memory usage and where executable code is located within the process space. If you look into ASLR, you’ll see this was a method of preventing potential attackers from knowing where code was loaded and it wouldn’t make sense to reveal it in a world-readable entry in procfs.

This protection was added way back in 2007:

This change implements a check using “ptrace_may_attach” before allowing access to read the maps contents. To control this protection, the new knob /proc/sys/kernel/maps_protect has been added, with corresponding updates to the procfs documentation.

Within ptrace_may_attach() (actually within one of the functions it calls) lies the following code:

if (((current->uid != task->euid) ||
(current->uid != task->suid) ||
(current->uid != task->uid) ||
(current->gid != task->egid) ||
(current->gid != task->sgid) ||
(current->gid != task->gid)) && !capable(CAP_SYS_PTRACE))
return -EPERM;
so that, unless you have the same real user/group ID, saved user/group ID and effective user/group ID (i.e., no sneaky setuid stuff) and they’re the same as the user/group ID that owns the process, you’re not allowed to see inside that “file” (unless your process has the CAP_SYS_PTRACE capability of course).

文件权限只是控制文件读写的一方面,对于/proc文件系统中文件,出于安全原因,从其他进程内存读取有额外的权限检查, /proc/$PID/maps显示了有关内存使用情况,以及可执行代码在进程空间中的位置的潜在危险信息, 操作系统在内核级使用CAP(Capabilities)机制控制,对于内核调用时可以分类分别控制,如果文件没有相同的UID/GID,还要确保使用 $PID 的进程对核心转储功能没有限制,可以临时启用审计规则auditctl 调测。简而言之就是使用pmap跟踪程序时,尝试read  /proc/PID的文件时需要CAP_SYS_PTRACE Capabilities, oracle读取root owner的maps文件,可以通过以下方式授权:

su - root
# setcap cap_sys_ptrace=eip /usr/bin/pmap

OOM KILL
In Linux oom_kill.c, out_of_memory() calls select_bad_process() to find processes to be killed.
If found, kill by oom_kill_process().
If not found, panic the system (halt the system, never return) by:
panic(“Out of memory and no killable processes…\n”);

TEST CODE

ksun’s SQL Scripts to test PGA

create or replace package pga_mem_test as
  procedure allo (p_mb int, p_sleep number := 0);
end;
/

create or replace package body pga_mem_test as
  type t_tab_kb   is table of char(1024);   -- 1KB
  p_tab_1mb          t_tab_kb := t_tab_kb();
  type t_tab_mb   is table of t_tab_kb;     
  p_tab_mb           t_tab_mb := t_tab_mb();
  p_sid              number   := sys.dbms_support.mysid;
  
  -------------------------------------------
  procedure rpt(l_name varchar) is
     l_v$process_mem            varchar2(4000);
     l_v$process_memory_mem     varchar2(4000);
  begin
   select 'Used/Alloc/Freeable/Max >>> '||
           round(pga_used_mem/1024/1024)    ||'/'||round(pga_alloc_mem/1024/1024)||'/'||
             round(pga_freeable_mem/1024/1024)||'/'||round(pga_max_mem/1024/1024)
       into l_v$process_mem
       from v$process 
       where addr = (select paddr from v$session where sid = p_sid);
      
    select 'Category(Alloc/Used/Max) >>> '||
             listagg(Category||'('||round(allocated/1024/1024)||'/'||
                     round(used/1024/1024)||'/'||round(max_allocated/1024/1024)||') > ')
     within group (order by Category desc) name_usage_list
       into l_v$process_memory_mem
       from v$process_memory
       where pid = (select pid from v$process
                     where addr = (select paddr from v$session where sid = p_sid));
  
    dbms_output.put_line(rpad(l_name, 20)||' > '||rpad(l_v$process_mem, 50));
    dbms_output.put_line('             ------ '||l_v$process_memory_mem);
  end rpt;
   
  -------------------------------------------   
  procedure allo (p_mb int, p_sleep number) is
  begin
   rpt('Start allocate: '||p_mb||' MB');
   
   select 'M' bulk collect into p_tab_1mb from dual connect by level <= 1024;  -- 1MB
   
   for i in 1..p_mb loop   -- p_mb MB
    p_tab_mb.extend;
    p_tab_mb(i) := p_tab_1mb;
   end loop;
  
   rpt('End allocate: '||p_mb||' MB');
   dbms_lock.sleep(p_sleep);
  end allo;
  
end;
/

/*
   exec dbms_session.reset_package;
   set  serveroutput on
   exec pga_mem_test.allo(1024*1, 30);       -- allocate 1GB
*/

------------------------------------------- 
create or replace procedure pga_mem_test_jobs(p_job_cnt number, p_mb number, p_sleep number := 0)
as
   l_job_id pls_integer;
begin
    for i in 1.. p_job_cnt loop
      dbms_job.submit(l_job_id, 'begin pga_mem_test.allo('||p_mb||', '||p_sleep||'); end;');
    end loop;
    commit;
end;    
/

--exec pga_mem_test_jobs(4, 1024*2, 60);   -- 4 Jobs, each allocates 2 GB, sleeping 60 seconds
打赏

, ,

对不起,这篇文章暂时关闭评论。