Troubleshooting “sqlplus / as sysdba” hang Case
一套oracle single instance on AIX环境,数据库未启动,客户硬件只是扩容了内存OS重启后,sqlplus启库无法登录,sqlplus / as sysdba命令挂起。
1, rename sqlnet.ora
2, 检查了/etc/hosts
3, 尝试了listener可以启动
4,relink sqlplus后依旧hang, sqlplus /nolog可以
5, 配置监听静态,sqlplus使用密码登录依旧hang
/* Linux */ strace -o sqlplus_hang.log -ttt sqlplus / as sysdba /* AIX */ truss -aefo sqlplus_hang.log sqlplus / as sysdba
# grep open sqlplus_hang.log 行 5: 32373 16:42:18.426092 [00007f3ff87df7a7] open("/etc/ld.so.preload", O_RDONLY|O_CLOEXEC) = 3 <0.000018> 行 9: 32373 16:42:18.426252 [00007f3ff87df7a7] open("/home/dbsectde/libdbsr.so", O_RDONLY|O_CLOEXEC) = 3 <0.000017> 行 18: 32373 16:42:18.426572 [00007f3ff87df7a7] open("/u01/app/oracle/product/11.2.0/db_1/lib/tls/x86_64/libsqlplus.so", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory) <0.000015> 行 20: 32373 16:42:18.426647 [00007f3ff87df7a7] open("/u01/app/oracle/product/11.2.0/db_1/lib/tls/libsqlplus.so", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory) <0.000014> 行 22: 32373 16:42:18.426707 [00007f3ff87df7a7] open("/u01/app/oracle/product/11.2.0/db_1/lib/x86_64/libsqlplus.so", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory) <0.000013> 行 24: 32373 16:42:18.426764 [00007f3ff87df7a7] open("/u01/app/oracle/product/11.2.0/db_1/lib/libsqlplus.so", O_RDONLY|O_CLOEXEC) = 3 <0.000018> 行 33: 32373 16:42:18.427049 [00007f3ff87df7a7] open("/u01/app/oracle/product/11.2.0/db_1/lib/libclntsh.so.11.1", O_RDONLY|O_CLOEXEC) = 3 <0.000015> 行 41: 32373 16:42:18.427303 [00007f3ff87df7a7] open("/u01/app/oracle/product/11.2.0/db_1/lib/libnnz11.so", O_RDONLY|O_CLOEXEC) = 3 <0.000016> 行 49: 32373 16:42:18.427567 [00007f3ff87df7a7] open("/u01/app/oracle/product/11.2.0/db_1/lib/libdl.so.2", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory) <0.000013> 行 50: 32373 16:42:18.427599 [00007f3ff87df7a7] open("/u01/app/oracle/product/11.2.0/db_1/lib/libdl.so.2", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory) <0.000012> 行 955: 32374 16:42:18.467506 [00007f5f6f830a30] open("/proc/self/cmdline", O_RDONLY) = 3 <0.000015> 行 963: 32374 16:42:18.467736 [00007f5f6f830a30] open("/home/dbsectde/Oracle_address", O_RDONLY) = 3 <0.000014> 行 1013: 32374 16:42:18.469280 [00007f5f6f830a30] open("/tmp/tdehook/cdump/20211214164218-32374.core", O_WRONLY|O_CREAT|O_APPEND, 0666) = 3 <0.000089> 行 1033: 32374 16:42:19.470180 [00007f5f6f830a30] open("/tmp/tdehook/cdump/20211214164219-32374.core", O_WRONLY|O_CREAT|O_APPEND, 0666) = 3 <0.000078> 行 1053: 32374 16:42:20.471090 [00007f5f6f830a30] open("/tmp/tdehook/cdump/20211214164220-32374.core", O_WRONLY|O_CREAT|O_APPEND, 0666) = 3 <0.000113> 行 1073: 32374 16:42:21.472079 [00007f5f6f830a30] open("/tmp/tdehook/cdump/20211214164221-32374.core", O_WRONLY|O_CREAT|O_APPEND, 0666) = 3 <0.000103> 行 1093: 32374 16:42:22.473046 [00007f5f6f830a30] open("/tmp/tdehook/cdump/20211214164222-32374.core", O_WRONLY|O_CREAT|O_APPEND, 0666) = 3 <0.000105> 行 1113: 32374 16:42:23.474061 [00007f5f6f830a30] open("/tmp/tdehook/cdump/20211214164223-32374.core", O_WRONLY|O_CREAT|O_APPEND, 0666) = 3 <0.000145> 行 1133: 32374 16:42:24.475002 [00007f5f6f830a30] open("/tmp/tdehook/cdump/20211214164224-32374.core", O_WRONLY|O_CREAT|O_APPEND, 0666) = 3 <0.000106> 行 1153: 32374 16:42:25.475925 [00007f5f6f830a30] open("/tmp/tdehook/cdump/20211214164225-32374.core", O_WRONLY|O_CREAT|O_APPEND, 0666) = 3 <0.000170> 行 1173: 32374 16:42:26.476960 [00007f5f6f830a30] open("/tmp/tdehook/cdump/20211214164226-32374.core", O_WRONLY|O_CREAT|O_APPEND, 0666) = 3 <0.000101> ... 循环中 # 查看open /emp/tdehook前部分内容 32374 16:42:18.467911 [00007f5f6f83a0f7] munmap(0x7f5f722cc000, 4096) = 0 <0.000009> 32374 16:42:18.467961 [00007f5f6f83a127] mprotect(0xcb23000, 4096, PROT_READ|PROT_WRITE|PROT_EXEC) = 0 <0.000010> 32374 16:42:18.467996 [00007f5f6f83a127] mprotect(0x2060000, 8192, PROT_READ|PROT_WRITE|PROT_EXEC) = 0 <0.000017> 32374 16:42:18.468044 [00007f5f6f83a127] mprotect(0x2060000, 8192, PROT_READ|PROT_EXEC) = 0 <0.000009> 32374 16:42:18.468074 [00007f5f6f83a127] mprotect(0xf74000, 8192, PROT_READ|PROT_WRITE|PROT_EXEC) = 0 <0.000014> 32374 16:42:18.468110 [00007f5f6f83a127] mprotect(0xf74000, 8192, PROT_READ|PROT_EXEC) = 0 <0.000008> 32374 16:42:18.468135 [00007f5f6f83a127] mprotect(0x5805000, 8192, PROT_READ|PROT_WRITE|PROT_EXEC) = 0 <0.000013> 32374 16:42:18.468170 [00007f5f6f83a127] mprotect(0x5805000, 8192, PROT_READ|PROT_EXEC) = 0 <0.000009> 32374 16:42:18.468196 [00007f5f6f83a127] mprotect(0x5805000, 8192, PROT_READ|PROT_WRITE|PROT_EXEC) = 0 <0.000008> 32374 16:42:18.468222 [00007f5f6f83a127] mprotect(0x5805000, 8192, PROT_READ|PROT_EXEC) = 0 <0.000009> 32374 16:42:18.468247 [00007f5f6f83a127] mprotect(0x5805000, 8192, PROT_READ|PROT_WRITE|PROT_EXEC) = 0 <0.000008> 32374 16:42:18.468272 [00007f5f6f83a127] mprotect(0x5805000, 8192, PROT_READ|PROT_EXEC) = 0 <0.000008> 32374 16:42:18.468298 [00007f5f6f83a127] mprotect(0x8545000, 8192, PROT_READ|PROT_WRITE|PROT_EXEC) = 0 <0.000013> 32374 16:42:18.468331 [00007f5f6f83a127] mprotect(0x8545000, 8192, PROT_READ|PROT_EXEC) = 0 <0.000009> 32374 16:42:18.468357 [00007f5f6f83a127] mprotect(0x8544000, 8192, PROT_READ|PROT_WRITE|PROT_EXEC) = 0 <0.000011> 32374 16:42:18.468389 [00007f5f6f83a127] mprotect(0x8544000, 8192, PROT_READ|PROT_EXEC) = 0 <0.000011> 32374 16:42:18.468433 [00007f5f6f83a127] mprotect(0x8545000, 8192, PROT_READ|PROT_WRITE|PROT_EXEC) = 0 <0.000010> 32374 16:42:18.468460 [00007f5f6f83a127] mprotect(0x8545000, 8192, PROT_READ|PROT_EXEC) = 0 <0.000009> 32374 16:42:18.468487 [00007f5f6f83a127] mprotect(0x8544000, 8192, PROT_READ|PROT_WRITE|PROT_EXEC) = 0 <0.000009> 32374 16:42:18.468513 [00007f5f6f83a127] mprotect(0x8544000, 8192, PROT_READ|PROT_EXEC) = 0 <0.000008> 32374 16:42:18.468539 [00007f5f6f83a127] mprotect(0x93e3000, 8192, PROT_READ|PROT_WRITE|PROT_EXEC) = 0 <0.000014> 32374 16:42:18.468574 [00007f5f6f83a127] mprotect(0x93e3000, 8192, PROT_READ|PROT_EXEC) = 0 <0.000008> 32374 16:42:18.468600 [00007f5f6f83a127] mprotect(0x1ec9000, 8192, PROT_READ|PROT_WRITE|PROT_EXEC) = 0 <0.000013> 32374 16:42:18.468634 [00007f5f6f83a127] mprotect(0x1ec9000, 8192, PROT_READ|PROT_EXEC) = 0 <0.000009> 32374 16:42:18.468659 [00007f5f6f83a127] mprotect(0x961b000, 8192, PROT_READ|PROT_WRITE|PROT_EXEC) = 0 <0.000012> 32374 16:42:18.468691 [00007f5f6f83a127] mprotect(0x961b000, 8192, PROT_READ|PROT_EXEC) = 0 <0.000008> 32374 16:42:18.468715 [00007f5f6f83a127] mprotect(0x961b000, 8192, PROT_READ|PROT_WRITE|PROT_EXEC) = 0 <0.000008> 32374 16:42:18.468741 [00007f5f6f83a127] mprotect(0x961b000, 8192, PROT_READ|PROT_EXEC) = 0 <0.000008> 32374 16:42:18.468766 [00007f5f6f83a127] mprotect(0x1bd9000, 8192, PROT_READ|PROT_WRITE|PROT_EXEC) = 0 <0.000013> 32374 16:42:18.468799 [00007f5f6f83a127] mprotect(0x1bd9000, 8192, PROT_READ|PROT_EXEC) = 0 <0.000009> 32374 16:42:18.468824 [00007f5f6f83a127] mprotect(0xa54a000, 8192, PROT_READ|PROT_WRITE|PROT_EXEC) = 0 <0.000012> 32374 16:42:18.468858 [00007f5f6f83a127] mprotect(0xa54a000, 8192, PROT_READ|PROT_EXEC) = 0 <0.000008> 32374 16:42:18.468885 [00007f5f6f83a127] mprotect(0x6478000, 8192, PROT_READ|PROT_WRITE|PROT_EXEC) = 0 <0.000012> 32374 16:42:18.468919 [00007f5f6f83a127] mprotect(0x6478000, 8192, PROT_READ|PROT_EXEC) = 0 <0.000008> 32374 16:42:18.468944 [00007f5f6f83a127] mprotect(0x1860000, 8192, PROT_READ|PROT_WRITE|PROT_EXEC) = 0 <0.000012> 32374 16:42:18.468981 [00007f5f6f83a127] mprotect(0x1860000, 8192, PROT_READ|PROT_EXEC) = 0 <0.000008> 32374 16:42:18.469017 [00007f5f6f8414f7] shmctl(0, SHM_INFO, 0x7ffc719f9120) = 0 <0.000009> 32374 16:42:18.469045 [00007f5f6f8414f7] shmctl(0, SHM_STAT, 0x7ffc719f90b0) = 65536 <0.000009> 32374 16:42:18.469076 [00007f5f6f841467] shmat(65536, 0, 0) = 0x7f5f6ef41000 <0.000013> 32374 16:42:18.469116 [00007f5f6f841497] shmdt(0x7f5f6ef41000) = 0 <0.000012> 32374 16:42:18.469145 [00007f5f6f841467] shmat(65536, 0, SHM_RDONLY) = 0x7f5f6ef41000 <0.000010> 32374 16:42:18.469176 [00007f5f6f841497] shmdt(0x7f5f6ef41000) = 0 <0.000010> 32374 16:42:18.469218 [00007f5f6f830d27] access("/tmp/tdehook", F_OK) = 0 <0.000011> 32374 16:42:18.469247 [00007f5f6f830d27] access("/tmp/tdehook/cdump", F_OK) = 0 <0.000009> 32374 16:42:18.469280 [00007f5f6f830a30] open("/tmp/tdehook/cdump/20211214164218-32374.core", O_WRONLY|O_CREAT|O_APPEND, 0666) = 3 <0.000089> 32374 16:42:18.469389 [00007f5f6f83fbe0] lseek(3, 0, SEEK_END) = 0 <0.000018> 32374 16:42:18.469436 [00007f5f6f830907] fchmod(3, 0666) = 0 <0.000012> # grep access sqlplus_hang.log 行 4: 32373 16:42:18.426057 [00007f3ff87df807] access("/etc/ld.so.preload", R_OK) = 0 <0.000010> 行 226: 32373 16:42:18.439559 [00007f3ff4b00d27] access("/u01/app/oracle/product/11.2.0/db_1/network/admin/sqlnet.ora", F_OK) = 0 <0.000012> 行 235: 32373 16:42:18.439865 [00007f3ff4b00d27] access("/home/oracle/.sqlnet.ora", F_OK) = -1 ENOENT (No such file or directory) <0.000012> 行 315: 32373 16:42:18.443221 [00007f3ff4b00d27] access("/u01/app/oracle/product/11.2.0/db_1/network/admin/sqlnet.ora", F_OK) = 0 <0.000012> 行 324: 32373 16:42:18.443543 [00007f3ff4b00d27] access("/etc/intchg.ora", F_OK) = -1 ENOENT (No such file or directory) <0.000009> 行 325: 32373 16:42:18.443574 [00007f3ff4b00d27] access("/u01/app/oracle/product/11.2.0/db_1/network/admin/intchg.ora", F_OK) = -1 ENOENT (No such file or directory) <0.000010> 行 326: 32373 16:42:18.443609 [00007f3ff4b00d27] access("/etc/tnsnav.ora", F_OK) = -1 ENOENT (No such file or directory) <0.000009> 行 327: 32373 16:42:18.443638 [00007f3ff4b00d27] access("/u01/app/oracle/product/11.2.0/db_1/network/admin/tnsnav.ora", F_OK) = -1 ENOENT (No such file or directory) <0.000010> 行 725: 32374 16:42:18.457644 [00007f5f721c9807] access("/etc/ld.so.preload", R_OK) = 0 <0.000010> 行 1011: 32374 16:42:18.469218 [00007f5f6f830d27] access("/tmp/tdehook", F_OK) = 0 <0.000011> 行 1012: 32374 16:42:18.469247 [00007f5f6f830d27] access("/tmp/tdehook/cdump", F_OK) = 0 <0.000009> 行 1031: 32374 16:42:19.470119 [00007f5f6f830d27] access("/tmp/tdehook", F_OK) = 0 <0.000010> 行 1032: 32374 16:42:19.470150 [00007f5f6f830d27] access("/tmp/tdehook/cdump", F_OK) = 0 <0.000009> 行 1051: 32374 16:42:20.471027 [00007f5f6f830d27] access("/tmp/tdehook", F_OK) = 0 <0.000011> 行 1052: 32374 16:42:20.471060 [00007f5f6f830d27] access("/tmp/tdehook/cdump", F_OK) = 0 <0.000010> 行 1071: 32374 16:42:21.472015 [00007f5f6f830d27] access("/tmp/tdehook", F_OK) = 0 <0.000011> -- 循环
Pstack 进程
#0 0x00007fd1aec4b7e0 in __read_nocancel () from /lib64/libpthread.so.0
#1 0x00007fd1afe4d517 in sntpcall_stg2 () from /u01/app/oracle/product/11.2.0/db_1/lib/libclntsh.so.11.1
#2 0x00007fd1afe4948b in sntpcall () from /u01/app/oracle/product/11.2.0/db_1/lib/libclntsh.so.11.1
#3 0x00007fd1afe45a27 in ntpcon () from /u01/app/oracle/product/11.2.0/db_1/lib/libclntsh.so.11.1
#4 0x00007fd1afd7ba15 in ntconn () from /u01/app/oracle/product/11.2.0/db_1/lib/libclntsh.so.11.1
#5 0x00007fd1afd337d2 in nsopen () from /u01/app/oracle/product/11.2.0/db_1/lib/libclntsh.so.11.1
#6 0x00007fd1afcf71de in nscall1 () from /u01/app/oracle/product/11.2.0/db_1/lib/libclntsh.so.11.1
#7 0x00007fd1afcf48c4 in nscall () from /u01/app/oracle/product/11.2.0/db_1/lib/libclntsh.so.11.1
#8 0x00007fd1afd91e40 in niotns () from /u01/app/oracle/product/11.2.0/db_1/lib/libclntsh.so.11.1
#9 0x00007fd1afe236d1 in nigcall () from /u01/app/oracle/product/11.2.0/db_1/lib/libclntsh.so.11.1
#10 0x00007fd1afd9cf58 in osncon () from /u01/app/oracle/product/11.2.0/db_1/lib/libclntsh.so.11.1
#11 0x00007fd1afc2c2ab in kpuadef () from /u01/app/oracle/product/11.2.0/db_1/lib/libclntsh.so.11.1
#12 0x00007fd1afc07df7 in upiini () from /u01/app/oracle/product/11.2.0/db_1/lib/libclntsh.so.11.1
#13 0x00007fd1afbffb82 in upiah0 () from /u01/app/oracle/product/11.2.0/db_1/lib/libclntsh.so.11.1
#14 0x00007fd1afc25775 in kpuatch () from /u01/app/oracle/product/11.2.0/db_1/lib/libclntsh.so.11.1
#15 0x00007fd1afc0bfbb in OCIServerAttach () from /u01/app/oracle/product/11.2.0/db_1/lib/libclntsh.so.11.1
#16 0x00007fd1b24c66a0 in afioci () from /u01/app/oracle/product/11.2.0/db_1/lib/libsqlplus.so
#17 0x00007fd1b24960d2 in aficntatt () from /u01/app/oracle/product/11.2.0/db_1/lib/libsqlplus.so
#18 0x00007fd1b2495901 in aficntcon () from /u01/app/oracle/product/11.2.0/db_1/lib/libsqlplus.so
#19 0x00007fd1b249bb19 in aficoncon () from /u01/app/oracle/product/11.2.0/db_1/lib/libsqlplus.so
#20 0x00007fd1b24bea55 in afilon () from /u01/app/oracle/product/11.2.0/db_1/lib/libsqlplus.so
#21 0x00007fd1b24b795c in afidrv () from /u01/app/oracle/product/11.2.0/db_1/lib/libsqlplus.so
#22 0x0000000000400753 in main ()
#0 0x00007fd1aec4b7e0 in __read_nocancel () from /lib64/libpthread.so.0
#1 0x00007fd1afe4d517 in sntpcall_stg2 () from /u01/app/oracle/product/11.2.0/db_1/lib/libclntsh.so.11.1
#2 0x00007fd1afe4948b in sntpcall () from /u01/app/oracle/product/11.2.0/db_1/lib/libclntsh.so.11.1
#3 0x00007fd1afe45a27 in ntpcon () from /u01/app/oracle/product/11.2.0/db_1/lib/libclntsh.so.11.1
#4 0x00007fd1afd7ba15 in ntconn () from /u01/app/oracle/product/11.2.0/db_1/lib/libclntsh.so.11.1
#5 0x00007fd1afd337d2 in nsopen () from /u01/app/oracle/product/11.2.0/db_1/lib/libclntsh.so.11.1
#6 0x00007fd1afcf71de in nscall1 () from /u01/app/oracle/product/11.2.0/db_1/lib/libclntsh.so.11.1
#7 0x00007fd1afcf48c4 in nscall () from /u01/app/oracle/product/11.2.0/db_1/lib/libclntsh.so.11.1
#8 0x00007fd1afd91e40 in niotns () from /u01/app/oracle/product/11.2.0/db_1/lib/libclntsh.so.11.1
#9 0x00007fd1afe236d1 in nigcall () from /u01/app/oracle/product/11.2.0/db_1/lib/libclntsh.so.11.1
#10 0x00007fd1afd9cf58 in osncon () from /u01/app/oracle/product/11.2.0/db_1/lib/libclntsh.so.11.1
#11 0x00007fd1afc2c2ab in kpuadef () from /u01/app/oracle/product/11.2.0/db_1/lib/libclntsh.so.11.1
#12 0x00007fd1afc07df7 in upiini () from /u01/app/oracle/product/11.2.0/db_1/lib/libclntsh.so.11.1
#13 0x00007fd1afbffb82 in upiah0 () from /u01/app/oracle/product/11.2.0/db_1/lib/libclntsh.so.11.1
#14 0x00007fd1afc25775 in kpuatch () from /u01/app/oracle/product/11.2.0/db_1/lib/libclntsh.so.11.1
#15 0x00007fd1afc0bfbb in OCIServerAttach () from /u01/app/oracle/product/11.2.0/db_1/lib/libclntsh.so.11.1
#16 0x00007fd1b24c66a0 in afioci () from /u01/app/oracle/product/11.2.0/db_1/lib/libsqlplus.so
#17 0x00007fd1b24960d2 in aficntatt () from /u01/app/oracle/product/11.2.0/db_1/lib/libsqlplus.so
#18 0x00007fd1b2495901 in aficntcon () from /u01/app/oracle/product/11.2.0/db_1/lib/libsqlplus.so
#19 0x00007fd1b249bb19 in aficoncon () from /u01/app/oracle/product/11.2.0/db_1/lib/libsqlplus.so
#20 0x00007fd1b24bea55 in afilon () from /u01/app/oracle/product/11.2.0/db_1/lib/libsqlplus.so
#21 0x00007fd1b24b795c in afidrv () from /u01/app/oracle/product/11.2.0/db_1/lib/libsqlplus.so
#22 0x0000000000400753 in main ()
循环中
note:
the __read_nocancel function comes from the libpthread operating system library , which is a place a server process is expected to wait and is actually in an oracle wait event (SQL*Net message from client).
更换一个同平台同版本的登录正常的测试库,跟踪sqlplus / as sysdba
[oracle@oel7db1 ~]$ grep access sqlplus_hang.log
5629 1639624344.255472 access(“/etc/ld.so.preload”, R_OK) = -1 ENOENT (No such file or directory)
….
发现/etc/ld.so.preload 在测试环境是不存在的, ld.so.preload用于预加载一些类库,通常sudo使用,甚至有时会有安全风险,尝试去掉该文件。
$ move /etc/ld.so.preload /etc/ld.so.preload_bak
sqlplus / as sysdba恢复正常。
对不起,这篇文章暂时关闭评论。