Troubleshooting Oracle instance startup failure with ORA-7445 [ipcor_net_get_ibdevname]

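Recently, a customer in Hainan reported that an Oracle 19c RAC database failed at startup with ORA-07445: exception encountered: core dump [ipcor_net_get_ibdevname()+71][SIGSEGV]. The crash itself is caused by an Oracle bug, but the underlying trigger is that the database could not access the API of certain devices, which usually points to a hardware problem. This post is only a brief record of the symptoms.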

DB alert log

Bug 34487786 Instance startup fails with ORA-7445 [ipcor_net_get_ibdevname]

During database startup, for certain devices, IPC API to access them were failing.
Which is expected behavior but code path for failure handling had issues which was resulting in crash.

REDISCOVERY INFORMATION:
If DB instance crashes during startup with ORA-07445: exception encountered: core dump [ipcor_net_get_ibdevname()+71][SIGSEGV],
then you are hitting this bug.

The fix for 34487786 is first included in
19.19.0.0.230418 (April 2023) DB Release Update (DB RU)
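
Since the bug note ties the fix to a Release Update level, one quick way to judge exposure is to compare the installed RU version with 19.19. The sketch below is only an illustration: the function name `ru_includes_fix` is hypothetical, it takes the version string as an argument (for example copied from the `opatch lspatches` listing), and it cannot tell whether a one-off patch carrying the same fix has been applied.

```shell
# Hypothetical helper: judge whether a 19c RU version string is at or above 19.19,
# the first RU the bug note lists as including the fix for 34487786.
# Exit status 0: 19.19 or later within the 19c line.
# Exit status 1: 19c, but older than 19.19.
# Exit status 2: not a 19.x version string this check can judge.
ru_includes_fix() {
    local version="$1" rest minor
    case "$version" in
        19.*) ;;
        *) return 2 ;;
    esac
    rest="${version#19.}"      # drop the leading "19."
    minor="${rest%%.*}"        # keep the RU number, e.g. "19" from "19.0.0.230418"
    case "$minor" in
        ''|*[!0-9]*) return 2 ;;
    esac
    [ "$minor" -ge 19 ]
}

# Example use (the version string here is typed in by hand):
#   if ru_includes_fix "19.18.0.0.230117"; then echo "RU includes the fix"; else echo "RU predates the fix"; fi
```

Even on an RU that includes the fix, the failing device access that triggers the code path is still worth investigating, because the fix only addresses the crash in the failure handling.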

OS log /var/log/messages


The log reports a hardware error on device 00:b0:02.0.
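
To pull the relevant lines out of a large log, a literal, case-insensitive search for the PCI address is usually enough. This is a minimal sketch: the helper name `log_lines_for_device` is hypothetical, and the default path assumes the log lives at /var/log/messages.

```shell
# Hypothetical helper: print the lines of a log file that mention a PCI address.
# -F treats the address as a literal string (so "." is not a regex wildcard),
# -i ignores case because hex digits may be logged in either case.
log_lines_for_device() {
    local addr="$1" logfile="${2:-/var/log/messages}"
    grep -i -F -- "$addr" "$logfile"
}

# Example use:
#   log_lines_for_device "b0:02.0"
#   log_lines_for_device "b0:02.0" /var/log/messages-20230401   # a rotated log; file name is illustrative
```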

Identify the device

# lspci|grep -i b0:02.0
b0:02.0 PCI bridge: Intel Corporation Device 347a

# lspci -vvs b0:02.0
b0:02.0 PCI bridge: Intel Corporation Device 347a (rev 04) (prog-if 00 [Normal decode])
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx*
    Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR- TAbort- <TAbort-
    Reset- FastB2B-
    PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
    Capabilities: [40] Express (v2) Root Port (Slot+), MSI 00
        DevCap: MaxPayload 512 bytes, PhantFunc
            ExtTag+ RBE+
        DevCtl: Report errors: Correctable+ Non-Fatal+ Fatal+ Unsupported
            RlxdOrd- ExtTag+ PhantFunc- AuxPwr- NoSnoop-
            MaxPayload 512 bytes, MaxReadReq 4096 bytes
        DevSta: CorrErr- UncorrErr, FatalErr- UnsuppReq- AuxPwr- TransPend-
        LnkCap: Port #5, Speed 16GT/s, Width x16, ASPM not supported, Exit Latency L0s <1us, L1 <16us
            ClockPM- Surprise+ LLActRep+ BwNot+ ASPMOptComp+
        LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+
            ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
        SlotClk+ DLActive+ BWMgmt- ABWMgmt-
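
In `lspci -vv` output, the DevSta field of the PCI Express capability carries the device's error status bits, where a trailing `+` means the bit is set and `-` means it is clear. The sketch below filters that field down to the error bits that are set. The function name `devsta_set_flags` is hypothetical, and it assumes well-formed `lspci -vv` output on standard input (run as root, since capabilities are hidden from unprivileged users).

```shell
# Hypothetical helper: read `lspci -vv` output for one device on stdin and
# print the PCIe error status flags that are set in the first DevSta line.
# Prints nothing (and returns non-zero) when none of the four flags is set.
devsta_set_flags() {
    grep -m1 'DevSta:' \
        | tr ' \t' '\n\n' \
        | grep -E '^(CorrErr|UncorrErr|FatalErr|UnsuppReq)\+$'
}

# Example use:
#   lspci -vvs b0:02.0 | devsta_set_flags
```

These bits are sticky status flags, so a set bit shows that an error was recorded at some point, not necessarily that it is still occurring.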

# find /sys/ -iname ib1
/sys/devices/pci0000:b0/0000:b0:02.0/0000:b1:00.0/net/ib1
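
The sysfs path of a network interface lists every PCI device between the root complex and the card, so it shows which bridge an interface sits behind. A small sketch that extracts those addresses from such a path follows. The function name `pci_chain` is hypothetical, and it works purely on the path string.

```shell
# Hypothetical helper: given a sysfs device path, print each PCI address
# (domain:bus:device.function) found in it, outermost bridge first.
pci_chain() {
    printf '%s\n' "$1" \
        | tr '/' '\n' \
        | grep -E '^[0-9a-fA-F]{4}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7]$'
}

# Example use: resolve the interface's device symlink, then list its PCI ancestry.
#   pci_chain "$(readlink -f /sys/class/net/ib1/device)"
```

The last address printed is the card itself, and the ones before it are the bridges above it, which can then be matched against the address named in the OS log.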

# ibdev2netdev -v

Cause

The error was caused by a faulty IB card behind ib1; replacing the IB card is recommended.
