hugepage on linux

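Memory management is a complex subsystem. The operating system manages memory through its virtual memory system, and the page table records the mapping between virtual and physical addresses. When a process accesses data, the address is first looked up in the page table and then resolved to the real physical address. The CPU has a fixed-size cache that holds part of the page table, the translation lookaside buffer (TLB). Each page table entry (PTE) maps one page of memory, somewhat like an index in Oracle. To speed up access, as many mappings as possible should be kept in the TLB.

The page table size of a single process is reported as VmPTE in /proc/<pid>/status: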

[root@matrix 31510]# ps -ef|grep smon
oracle 439 1 0 13:51 ? 00:00:00 ora_smon_qinwen
root 14284 31510 0 15:54 pts/1 00:00:00 grep smon
[root@matrix 31510]# grep PTE /proc/31510/status
VmPTE: 72 kB
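
To see how much page table memory a whole set of processes consumes, the per-process VmPTE values can be added up. The following is only a sketch: the function name sum_vmpte and its optional root-directory argument are my own inventions, and it assumes a kernel that exposes a VmPTE line in /proc/<pid>/status.

```shell
#!/bin/sh
# sum_vmpte [PROC_ROOT]   (hypothetical helper, not part of any tool)
# Adds up the VmPTE values (in kB) found in every <PROC_ROOT>/<pid>/status.
# PROC_ROOT defaults to /proc; the argument exists to make testing easy.
sum_vmpte() {
    root=${1:-/proc}
    total=0
    for f in "$root"/[0-9]*/status; do
        # Skip unreadable files and the unexpanded glob when nothing matches
        [ -r "$f" ] || continue
        # The line looks like: "VmPTE:      72 kB"; kernel threads have none
        kb=$(awk '/^VmPTE:/ { print $2 }' "$f" 2>/dev/null)
        total=$((total + ${kb:-0}))
    done
    echo "$total"
}

# Usage on a live system (prints a total in kB):
# sum_vmpte
```

The result should be close to the PageTables figure in /proc/meminfo, although the two are not guaranteed to match exactly.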

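The VM subsystem manages memory in pages. On Linux each page (chunk) is 4 KB by default. When a large amount of memory is managed (roughly more than 8 GB, though there is no hard threshold), the number of mapping entries becomes very large and the page tables grow accordingly. When many processes perform heavy memory operations in parallel, the growth and swapping of page tables consume valuable memory and other resources.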
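
A quick calculation shows why the page size matters. The sketch below (pages_needed is an illustrative name) counts the pages required to map a region with 4 KB pages and with 2 MB hugepages. Every page needs its own page table entry.

```shell
#!/bin/sh
# pages_needed BYTES PAGE_BYTES   (illustrative helper)
# Number of pages required to map BYTES of memory, rounded up.
pages_needed() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

# Mapping 16 GiB (17179869184 bytes):
pages_needed 17179869184 4096      # 4 KiB pages -> 4194304
pages_needed 17179869184 2097152   # 2 MiB pages -> 8192
```

Each process that maps the memory keeps its own page table entries for it, which is why the total grows with the number of processes.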

I ran into this while investigating a recent Oracle performance problem:
[oracle@icme-db ~]$ cat /proc/meminfo
MemTotal: 16407000 kB
MemFree: 60188 kB
Buffers: 2300 kB
Cached: 7549028 kB
SwapCached: 65624 kB
Active: 10892792 kB
Inactive: 289248 kB
HighTotal: 0 kB
HighFree: 0 kB
LowTotal: 16407000 kB
LowFree: 60188 kB
SwapTotal: 16386260 kB
SwapFree: 15750400 kB
Dirty: 28 kB
Writeback: 0 kB
Mapped: 10886892 kB
Slab: 147392 kB
CommitLimit: 24589760 kB
Committed_AS: 18295880 kB
PageTables: 4929120 kB
VmallocTotal: 536870911 kB
VmallocUsed: 270488 kB
VmallocChunk: 536600243 kB
HugePages_Total: 0
HugePages_Free: 0
Hugepagesize: 2048 kB
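
The share of RAM taken by page tables can be read straight from this output. A minimal sketch, assuming the /proc/meminfo field names shown above; the function name pagetables_pct and its optional file argument are illustrative.

```shell
#!/bin/sh
# pagetables_pct [MEMINFO_FILE]   (illustrative helper)
# Prints PageTables as a percentage of MemTotal with one decimal place.
pagetables_pct() {
    # LC_ALL=C keeps the decimal separator a dot regardless of locale
    LC_ALL=C awk '
        /^MemTotal:/   { total = $2 }
        /^PageTables:/ { pt = $2 }
        END {
            if (total > 0) printf "%.1f\n", pt * 100 / total
        }
    ' "${1:-/proc/meminfo}"
}

# Usage on a live system:
# pagetables_pct
```

With the figures above (PageTables 4929120 kB, MemTotal 16407000 kB) this prints 30.0, so about 30% of RAM goes to page tables.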

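With 16 GB of memory in total, the page tables already take nearly 5 GB. They also significantly increase the CPU's lookup time and add to the CPU load. This is the point at which hugepages should be configured to increase the page granularity. On current Linux versions the default hugepage size is 2 MB. Hugepage memory is shared memory that stays pinned in RAM and is never swapped out. In Oracle it can be used only for the SGA, not for the PGA, so size the hugepage pool according to the SGA: a pool that is too large wastes memory, while a pool even slightly smaller than the SGA will not be used by the SGA at all. For this reason a My Oracle Support (MOS) note provides a script that recommends a hugepage value. The idea is to look at the sizes of the shared memory segments with ipcs while Oracle is running under its normal workload. Note that Automatic Memory Management (AMM) in Oracle 11g does not support hugepages.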

#!/bin/bash
#
# hugepages_settings.sh
#
# Linux bash script to compute values for the
# recommended HugePages/HugeTLB configuration
#
# Note: This script does calculation for all shared memory
# segments available when the script is run, no matter it
# is an Oracle RDBMS shared memory segment or not.
# Check for the kernel version
KERN=`uname -r | awk -F. '{ printf("%d.%d\n",$1,$2); }'`
# Find out the HugePage size
HPG_SZ=`grep Hugepagesize /proc/meminfo | awk {'print $2'}`
# Start from 1 pages to be on the safe side and guarantee 1 free HugePage
NUM_PG=1
# Cumulative number of pages required to handle the running shared memory segments
for SEG_BYTES in `ipcs -m | awk {'print $5'} | grep "[0-9][0-9]*"`
do
   MIN_PG=`echo "$SEG_BYTES/($HPG_SZ*1024)" | bc -q`
   if [ $MIN_PG -gt 0 ]; then
      NUM_PG=`echo "$NUM_PG+$MIN_PG+1" | bc -q`
   fi
done
# Finish with results
case $KERN in
   '2.4') HUGETLB_POOL=`echo "$NUM_PG*$HPG_SZ/1024" | bc -q`;
          echo "Recommended setting: vm.hugetlb_pool = $HUGETLB_POOL" ;;
   '2.6') echo "Recommended setting: vm.nr_hugepages = $NUM_PG" ;;
    *) echo "Unrecognized kernel version $KERN. Exiting." ;;
esac
# End

chmod u+x hugepages_settings.sh
./hugepages_settings.sh
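
To see what the script computes, its page-counting loop can be isolated into a function that takes the segment sizes as arguments instead of reading them from ipcs -m. This is a sketch of the same arithmetic, not a replacement for the Oracle script, and the name count_hugepages is hypothetical.

```shell
#!/bin/sh
# count_hugepages HPG_KB SEG_BYTES...   (hypothetical helper)
# Mirrors the loop in hugepages_settings.sh: start from 1 page, then for
# every shared memory segment at least one hugepage in size, add
# floor(segment size / hugepage size) + 1.
count_hugepages() {
    hpg_bytes=$(( $1 * 1024 ))
    shift
    num_pg=1
    for seg in "$@"; do
        min_pg=$(( $seg / $hpg_bytes ))
        if [ "$min_pg" -gt 0 ]; then
            num_pg=$(( $num_pg + $min_pg + 1 ))
        fi
    done
    echo "$num_pg"
}

# One 4 GiB segment with 2048 kB hugepages
count_hugepages 2048 4294967296
```

For a single 4 GiB segment this prints 2050: 2048 pages for the segment, one extra to cover rounding, and the one spare page the script starts with. Segments smaller than one hugepage are ignored.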

Configuring HugePages

1. Edit /etc/security/limits.conf to set how much memory the oracle user may lock in RAM. The value is in KB and should be at least nr_hugepages × Hugepagesize (in MB) × 1024.

oracle soft memlock 41943030
oracle hard memlock 41943030

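2. Shut down the Oracle instance. If the hugepage setting is changed while Oracle is running, the SGA will not start using hugepages right away, and the instance may even crash for lack of memory.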

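3. Edit /etc/sysctl.conf to set the number of hugepages. For example, with SGA_MAX_SIZE=4G and a 2 MB hugepage size, the SGA needs at least 4096 MB / 2 MB = 2048 pages, so leave a little headroom:
vm.nr_hugepages=2050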

4. Apply the kernel parameters immediately (run as root):
sysctl -p

5. Start the Oracle instance and check hugepage usage through HugePages_Rsvd or HugePages_Free (the fields differ between RHEL 4 and RHEL 5):
cat /proc/meminfo
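
The arithmetic behind steps 1, 3 and 5 can be sketched as a few small shell functions. All three names are hypothetical. The sketch assumes that memlock in limits.conf is expressed in KB and that Hugepagesize is given in kB as in /proc/meminfo.

```shell
#!/bin/sh
# Hypothetical helpers for the steps above; the names are illustrative.

# hugepages_for_sga SGA_MB HUGEPAGE_KB [SPARE_PAGES]
# Step 3: minimum number of hugepages for an SGA of SGA_MB megabytes,
# rounded up, plus an optional number of spare pages (default 0).
hugepages_for_sga() {
    sga_kb=$(( $1 * 1024 ))
    echo $(( ($sga_kb + $2 - 1) / $2 + ${3:-0} ))
}

# min_memlock_kb NR_HUGEPAGES HUGEPAGE_KB
# Step 1: smallest memlock limit (in KB) that covers the whole pool.
min_memlock_kb() {
    echo $(( $1 * $2 ))
}

# hugepage_usage [MEMINFO_FILE]
# Step 5: prints the total, free and reserved hugepage counts, plus the
# number already allocated (total - free). A missing HugePages_Rsvd line
# is reported as 0.
hugepage_usage() {
    awk '
        /^HugePages_Total:/ { total = $2 }
        /^HugePages_Free:/  { free  = $2 }
        /^HugePages_Rsvd:/  { rsvd  = $2 }
        END {
            printf "total=%d free=%d rsvd=%d allocated=%d\n",
                   total, free, rsvd, total - free
        }
    ' "${1:-/proc/meminfo}"
}

# 4 GB SGA, 2048 kB hugepages, 2 spare pages
pages=$(hugepages_for_sga 4096 2048 2)
echo "vm.nr_hugepages=$pages"
echo "memlock >= $(min_memlock_kb "$pages" 2048) KB"

# After starting the instance, on a live system:
# hugepage_usage
```

The shared memory segments of a running instance can be somewhat larger than SGA_MAX_SIZE, so the sizes reported by ipcs -m are a safer input than the parameter value alone.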

I recommend configuring hugepages for every Oracle database on Linux, not only on 32-bit systems.

See also:
Metalink Notes: 361323.1, 744769.1, 748637.1
Christo Kutrovsky's article
