Instance Crash in 11.2.0.3 RAC and ORA-600 [kcbo_switch_cq_1]

Errors in file /oracle/app/oracle/diag/rdbms/anbob/anbob1/trace/anbob1_pmon_15074000.trc:
ORA-00600: internal error code, arguments: [kcbo_switch_cq_1], [], [], [], [], [], [], [], [], [], [], []
PMON (ospid: 15074000): terminating the instance due to error 472

CRSD fails to start: crsd.log shows “Policy Engine is not initialized yet” and evmd.log shows “[gipcretConnectionRefused] [29]”

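Recently node 2 of a database cluster suddenly crashed in the middle of the night and was evicted by node 1. The agent failed to start the DB, and a manual CRS restart failed as well. The symptoms in the logs turned out to resemble several bugs documented on MOS, yet matched none of them. CRS failed to start on node 2 (AIX environment); the CRS log showed “Policy Engine is not initialized yet” and evmd.log showed “[gipcretConnectionRefused] [29]”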

Case study: 11.2.0.3 CRS starts slowly and cssd.log shows ‘Msg-reply has soap fault 10’

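Changing the public IP should be enough, but some middleware was configured early on to connect to the database through the public IP, and it cannot be inventoried and changed in a short time. How can the slow CRS startup be fixed without affecting the middleware, or at least while buying time to sort the middleware out? Below is one approach of mine.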

Oracle 11g R2 Clusterware startup sequence (video animation)

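A long time ago I found a very good video online that describes the Oracle 11g R2 Clusterware startup process. I don't want to keep it to myself, so I'm sharing it with everyone.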

Does the FG (server process) communicate directly with a remote node's LMSn process over the interconnect?

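There should be communication between a local server process and a remote LMS process, yet the vendor's engineer showed me their white paper as evidence that only LMS-to-LMS communication takes place. So what actually happens: does LMS communicate directly with a remote server process?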

Listener has no registered services and shows INTERMEDIATE status with “Not All Endpoints Registered” in 11gR2 RAC

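This was an 11gR2 RAC environment. CRSCTL CHECK CRS showed that the CRS service could no longer be contacted, and I had him confirm that the crsd.bin process no longer existed. I advised that restarting CRS would fix it, but I was later told that clients still could not connect to one node. A check showed that the listener had not registered any service and was listening only on the public IP, while the DB parameter LOCAL_LISTENER was bound to the VIP,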

kjfspseudorcfg and kjxgrrcfgchk: some reason numbers

kjxgrrcfgchk: Initiating reconfig, reason=3 #######<<<<<<<<<
kjxgrrcfgchk: COMM rcfg - Disk Vote Required
kjfmReceiverHealthCB_CheckAll: Recievers are healthy.

Troubleshooting a RAC instance crash caused by a private network IP address conflict

Today the host of one node in a RAC rebooted and CRS did not start (10.2.0.5, 2-node RAC on AIX). Starting CRS manually still did not bring it up. The errors are summarized below… DUPLICATE IP ADDRESS DETECTED IN THE NET

How to drop an ASM diskgroup in RAC? (DB resource shows OFFLINE after the diskgroup is dropped)

I chose to use sqlplus “drop diskgroup”. However, the dropped diskgroup is still listed as a resource. Actually, the DB is still open; check the alert log

ORA-00313, ORA-00312, ORA-17503, ORA-15001

A thorny road: installing RAC on NFS

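Over the past two days I set up a RAC on NFS for fun. With no decent machines available, I used just three PCs with 2 GB of memory each and a single physical NIC per PC, but the installation eventually succeeded. Below are some small problems encountered during the installation.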

Changing the SYS Password in RAC

We know that changing a normal user's password is no different between a single-instance database and a RAC database: we just run “alter user xxx identified by xxx” and the password is changed.

Configuring 11gR2 physical Data Guard: 11.2.0.3 RAC to single instance

The primary DB is a 2-node 11.2.0.3 RAC and the physical standby is an 11.2.0.3 single instance. The primary DB stores its files in ASM; the single instance uses only the local filesystem.

Removing a node from 11g R2 RAC on OEL5

[grid@znode1 bin]$ cluvfy stage -post nodedel -n racnode3 -verbose
Performing post-checks for node removal
Checking CRS integrity…
Clusterware version consistency passed
The Oracle Clusterware is healthy on node “znode2”
The Oracle Clusterware is healthy on node “znode1”
CRS integrity check passed
Result:
Node removal check passed
Post-check for node removal was successful.

Oracle 11g R2 RAC addnode (adding a RAC node): practice and caveats

1. Configure the OS environment
2. Install Grid (Clusterware) on the third node
3. Install the Oracle RDBMS software on the third node
4. Add an Oracle instance on the third node

ORA-01078: failure in processing system parameters when srvctl start instance

[grid@znode1 ~]$ srvctl start instance -d rac -i rac1
PRCR-1013 : Failed to start resource ora.rac.db
PRCR-1064 : Failed to start resource ora.rac.db on node znode1
CRS-5017: The resource action “ora.rac.db start” encountered the following error:
ORA-01078: failure in processing system parameters

11.2.0.3 RAC addNode.sh hits PRKC-PRCF-2015 PRCF-2023

PRCF-2023 : The following contents are not transferred as they are non-readable.
Files:
1) /u01/app/11.2.0/grid/OCRDUMPFILE
2) /u01/app/11.2.0/grid/locr.lst

Changing the SCAN IP in 11g R2 RAC

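SCAN simplifies client connection configuration: the client only needs to specify the SCAN name in tnsnames.ora and gets load balancing without knowing each node's VIP.
PMON on each node frequently sends that node's load information to the SCAN listener…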
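
As a rough illustration of the client-side setup described above, a tnsnames.ora entry that points at the SCAN name might look like the sketch below. The alias RAC, the host rac-scan.example.com, the port 1521, and the service name rac are all hypothetical placeholders, not values from the original post.

```
# Hypothetical alias; the client references only the SCAN name, not the node VIPs
RAC =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = rac-scan.example.com)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = rac)
    )
  )
```

Because the entry names only the SCAN, a later change of the SCAN IP addresses in DNS should not require touching client configuration, assuming the SCAN name itself stays the same.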

11g R2 Oracle Local Registry (OLR)

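Starting with Oracle 11g R2, new features were introduced in the grid computing area, one of which is the Grid Oracle Local Registry (OLR). It is part of Oracle Clusterware; some people prefer to call it the Oracle Local Repository, because this repository records the information and configuration of local resources..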

After OS reboot, 11g RAC ASM does not start automatically: ORA-27102

[ohasd(4507)]CRS-2807:Resource ‘ora.asm’ failed to start automatically.
SQL> startup nomount
ORA-27102: out of memory

RAC 11.2.0.3 Grid installation: PRVF-5636 : The DNS response time for an unreachable node exceeded “15000” ms

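By default OUI uses names that include the domain, such as znode1.anbob.com/znode1-vip.anbob.com. I removed the domain, changing them to znode1/znode1-vip, and did the same for the newly added znode2/znode2-vip; on the next step, the check passed.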
