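DBVerify reports corrupt blocks ("Completely zero block found during dbv") on a raw device, but RMAN does not
The database is 11.2.0.3 RAC on AIX and uses raw devices. A DBV scan reported a large number of corrupted blocks, yet RMAN found no corrupted blocks at all, neither physical nor logical. My previous note, "DBV not always correct, as in an extreme case the use of raw device", recorded how dbv scans a file and the problems that can arise. Here is a brief write-up of this case.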
SQL> @ls users

TABLESPACE_NAME                   FILE_ID FILE_NAME                        EXT         MB      MAXSZ
------------------------------ ---------- -------------------------------- --- ---------- ----------
USERS                                 815 /dev/rjf_lv16_019                NO       16895
USERS                                 327 /dev/rjf_lv16_103                NO       16383
USERS                                 332 /dev/rjf_lv16_104                NO       16383
USERS                                 676 /dev/rjf_lv16_328                NO       16383
USERS                                 979 /dev/rjf_lv16_477                NO       16383
USERS                                 980 /dev/rjf_lv16_478                NO       16383
USERS                                 680 /dev/rjf_lv32_344                NO       32767
USERS                                   5 /dev/rjf_user                    NO         510

8 rows selected.

SQL> select block_size from v$datafile where file#=815;

BLOCK_SIZE
----------
     32768

oracle@anbob1:/home/oracle>dbv file=/dev/rjf_lv16_019 blocksize=32768
DBVERIFY: Release 11.2.0.3.0 - Production on Mon Sep 5 15:09:32 2016
Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved.
DBVERIFY - Verification starting : FILE = /dev/rjf_lv16_019
Page 540641 is marked corrupt
Corrupt block relative dba: 0xcbc83fe1 (file 815, block 540641)
Completely zero block found during dbv:
Page 540642 is marked corrupt
Corrupt block relative dba: 0xcbc83fe2 (file 815, block 540642)
Completely zero block found during dbv:
Page 540643 is marked corrupt
Corrupt block relative dba: 0xcbc83fe3 (file 815, block 540643)
Completely zero block found during dbv:
Page 540644 is marked corrupt
Corrupt block relative dba: 0xcbc83fe4 (file 815, block 540644)
Completely zero block found during dbv:
Page 540645 is marked corrupt
Corrupt block relative dba: 0xcbc83fe5 (file 815, block 540645)
Completely zero block found during dbv:
Page 540646 is marked corrupt
Corrupt block relative dba: 0xcbc83fe6 (file 815, block 540646)
Completely zero block found during dbv:
Page 540647 is marked corrupt
Corrupt block relative dba: 0xcbc83fe7 (file 815, block 540647)
Completely zero block found during dbv:
Page 540648 is marked corrupt
Corrupt block relative dba: 0xcbc83fe8 (file 815, block 540648)
.. .. ..
Page 552864 is marked corrupt
Corrupt block relative dba: 0xcbc86fa0 (file 815, block 552864)
Completely zero block found during dbv:
DBVERIFY - Verification complete
Total Pages Examined : 552959
Total Pages Processed (Data) : 309506
Total Pages Failing (Data) : 0
Total Pages Processed (Index): 228520
Total Pages Failing (Index): 0
Total Pages Processed (Other): 2509
Total Pages Processed (Seg) : 0
Total Pages Failing (Seg) : 0
Total Pages Empty : 105
Total Pages Marked Corrupt : 12319
Total Pages Influx : 0
Total Pages Encrypted : 0
Highest block SCN : 833124224 (3482.833124224)

oracle@anbob1:/home/oracle>rman target /
Recovery Manager: Release 11.2.0.3.0 - Production on Mon Sep 5 15:14:33 2016
Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved.
connected to target database: BILL (DBID=1844724644)

RMAN> backup validate check logical datafile 815;
... .. ...
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:55
List of Datafiles
=================
File Status Marked Corrupt Empty Blocks Blocks Examined High SCN
---- ------ -------------- ------------ --------------- ----------
815  OK     0              105          540640          14955914242407
  File Name: /dev/rjf_lv16_019
  Block Type Blocks Failing Blocks Processed
  ---------- -------------- ----------------
  Data       0              309506
  Index      0              228520
  Other      0              2509

RMAN> backup validate datafile 815;
channel ORA_DISK_1: backup set complete, elapsed time: 00:01:05
List of Datafiles
=================
File Status Marked Corrupt Empty Blocks Blocks Examined High SCN
---- ------ -------------- ------------ --------------- ----------
815  OK     0              105          540640          14956617523909
  File Name: /dev/rjf_lv16_019
  Block Type Blocks Failing Blocks Processed
  ---------- -------------- ----------------
  Data       0              311281
  Index      0              226748
  Other      0              2506
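As a quick sanity check on the two reports above, the arithmetic below compares the page counts and decodes one of the reported relative DBAs. It is only a sketch in Python: the helper name `decode_rdba` is hypothetical, and the split into 10 bits of relative file number and 22 bits of block number is the commonly described layout of a relative data block address, assumed here rather than read from the output.

```python
def decode_rdba(rdba: int) -> tuple:
    """Split a 32-bit relative DBA into (relative file#, block#).

    Assumption: upper 10 bits = relative file number, lower 22 bits = block.
    """
    return rdba >> 22, rdba & 0x3FFFFF


# Figures copied from the DBV and RMAN output above.
dbv_pages_examined = 552959
rman_blocks_examined = 540640
dbv_marked_corrupt = 12319

# Pages that DBV looked at but RMAN did not.
extra_pages = dbv_pages_examined - rman_blocks_examined

print(extra_pages, extra_pages == dbv_marked_corrupt)  # 12319 True
print(decode_rdba(0xCBC83FE1))                         # (815, 540641)
```

The number of pages DBV marked corrupt equals the number of pages it examined beyond what RMAN examined, and the first reported corrupt page is the block right after block 540640.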
TIP:
The DBV tool found a large number of corrupt blocks, but RMAN validation found none.
RMAN
for physical corruption check: backup validate datafile 'filename';
for logical corruption check: backup check logical validate datafile 'filename';
then find out whether there is corruption from dynamic view V$DATABASE_BLOCK_CORRUPTION
RMAN Vs DBVerify – Datafile Intra Block Corruption
Note 836658.1 says:
It has been a dilemma of what tool to use when identifying intra block corruptions. Here are some comparisons between RMAN and DBV:
- When the logical option is used by RMAN, it does exactly the same checks as DBV does for intra block corruption.
- RMAN can be run with PARALLELISM using multiple channels making it faster than DBV which cannot be run in parallel in a single command. See Note 472231.1 for examples.
- DBV checks for empty blocks. In 10g RMAN may not check blocks in free extents when Locally Managed Tablespaces are used. In 11g RMAN checks for both free and used extents.
- Both DBV and RMAN (11g) can check for a range of blocks. RMAN: VALIDATE DATAFILE 1 BLOCK 10 to 100;. DBV: start=10 end=100
- RMAN keeps corruption information in the control file (v$database_block_corruption, v$backup_corruption). DBV does not.
- RMAN may not report the corruption details like what is exactly corrupted in a block reported as a LOGICAL corrupted block. DBV reports the corruption details in the screen or in a log file.
- DBV can scan blocks with a higher SCN than a given SCN (HIGH_SCN clause).
- DBV does not need a connection to the database.
- RMAN does not detect the Logical Corruptions described in Doc ID 7517208.8 (DBV does).
# Attempting to dump any one of the reported corrupt blocks fails
SQL> alter system dump datafile 815 block 552861;
alter system dump datafile 815 block 552861
*
ERROR at line 1:
ORA-01410: invalid ROWID

-- dba2
select owner, segment_name, partition_name, tablespace_name, extent_id
from dba_extents
where file_id = &1
and &2 between block_id and block_id + blocks - 1;

SQL> @dba2 815 552861
no rows selected

SQL> select *
  2  from dba_free_space
  3  where file_id = 815
  4  and 552861 between block_id and block_id + blocks - 1;
no rows selected

SQL> select * from dba_extents where file_id = 815 order by block_id desc
SEGMENT_TYPE       TABLESPACE_NAME         EXTENT_ID    FILE_ID   BLOCK_ID      BYTES     BLOCKS RELATIVE_FNO
------------------ ---------------------- ---------- ---------- ---------- ---------- ---------- ------------
TABLE              USERS                           0        815     540613     196608          6          815
INDEX PARTITION    USERS                         136        815     540165    8388608        256          815
TABLE              USERS                         112        815     539653    8388608        256          815
INDEX PARTITION    USERS                         141        815     539397    8388608        256          815

SQL> select *
  2  from dba_free_space
  3  where file_id = 815
  4  and 552861 between block_id and block_id + blocks - 1;
no rows selected

SQL> select * from dba_free_space where file_id = 815 order by block_id desc;
TABLESPACE_NAME                   FILE_ID   BLOCK_ID      BYTES     BLOCKS RELATIVE_FNO
------------------------------ ---------- ---------- ---------- ---------- ------------
USERS                                 815     540619     720896         22          815
USERS                                 815     540421    6291456        192          815
USERS                                 815     539909    8388608        256          815

oracle@anbob1:/home/oracle>dd if=/dev/rjf_lv16_019 of=/tmp/dbf19.dd bs=32768 skip=552840 count=10
10+0 records in.
10+0 records out.
oracle@anbob1:/home/oracle>ls -l /tmp/dbf19.dd
-rw-r--r-- 1 oracle oinstall 327680 Sep 05 16:21 /tmp/dbf19.dd
oracle@anbob1:/home/oracle>od -t x /tmp/dbf19.dd
0000000 00000000 00000000 00000000 00000000
*
1200000
NOTE:
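The corrupt blocks reported by DBV do not belong to any extent, and reading them shows nothing but zeros, i.e. they are unformatted blocks. For comparison, dumping a normal block that holds data looks like this:
dd if=/dev/rjf_lv16_019 of=/tmp/dbf19_ok.dd bs=32768 skip=540640 count=2
oracle@anbob1:/home/oracle>od -t x /tmp/dbf19_ok.dd
0000000 06e20000 cbc83fe0 f6437703 0d370106
cbc83fe0 ===> rdba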
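The by-eye comparison of the two od dumps can be sketched in Python as follows. The function names `read_block` and `describe_block` are hypothetical. Reading the rdba from bytes 4 to 7 as a big-endian integer is an assumption based on the dump above (AIX on POWER is big-endian), not a documented block layout.

```python
import struct

BLOCK_SIZE = 32768  # block size of datafile 815, from v$datafile


def read_block(path: str, block_no: int, block_size: int = BLOCK_SIZE) -> bytes:
    """Rough equivalent of: dd if=path bs=block_size skip=block_no count=1."""
    with open(path, "rb") as f:
        f.seek(block_no * block_size)
        return f.read(block_size)


def describe_block(block: bytes) -> str:
    """Classify one raw block the way the od dumps were read by eye."""
    if block.count(0) == len(block):
        return "all zero (never formatted)"
    # Assumption: bytes 4..7 hold the rdba, stored big-endian.
    (rdba,) = struct.unpack(">I", block[4:8])
    return "formatted, rdba=0x%08x (file %d, block %d)" % (
        rdba, rdba >> 22, rdba & 0x3FFFFF)


# First 16 bytes of the good block from the od output, padded to a full block
# with arbitrary non-zero filler (the rest of the real block is not shown).
good = bytes.fromhex("06e20000 cbc83fe0 f6437703 0d370106").ljust(BLOCK_SIZE, b"\x01")

print(describe_block(good))               # formatted, rdba=0xcbc83fe0 (file 815, block 540640)
print(describe_block(bytes(BLOCK_SIZE)))  # all zero (never formatted)
```

On the real system, `describe_block(read_block("/dev/rjf_lv16_019", 552861))` would be expected to report an all-zero block, matching the dd/od check above.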
Because we are using a raw device, let's confirm the size of this file.
SQL> SELECT blocks FROM v$datafile WHERE file#=815;

    BLOCKS
----------
    540640

SQL> select 540640*32768/1024/1024 from dual;

540640*32768/1024/1024
----------------------
                 16895

oracle@anbob1:/home/oracle>dbfsize /dev/rjf_lv16_019
Database file: /dev/rjf_lv16_019
Database file type: raw device without 4K starting offset
Database file size: 552959 32768 byte blocks

oracle@anbob1:/home/oracle>lslv jf_lv16_019
LOGICAL VOLUME:     jf_lv16_019            VOLUME GROUP:   jf_data01
LV IDENTIFIER:      00f84db500004c000000014bc392228f.158 PERMISSION:     read/write
VG STATE:           active/complete        LV STATE:       opened/syncd
TYPE:               raw                    WRITE VERIFY:   off
MAX LPs:            540                    PP SIZE:        32 megabyte(s)
COPIES:             1                      SCHED POLICY:   striped
LPs:                540                    PPs:            540
STALE PPs:          0                      BB POLICY:      relocatable
INTER-POLICY:       maximum                RELOCATABLE:    no
INTRA-POLICY:       middle                 UPPER BOUND:    20
MOUNT POINT:        N/A                    LABEL:          None
MIRROR WRITE CONSISTENCY: on/ACTIVE
EACH LP COPY ON A SEPARATE PV ?: yes (superstrict)
Serialize IO ?:     NO
INFINITE RETRY:     no
STRIPE WIDTH:       20
STRIPE SIZE:        128k
DEVICESUBTYPE:      DS_LVZ
COPY 1 MIRROR POOL: None
COPY 2 MIRROR POOL: None
COPY 3 MIRROR POOL: None
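The sizes reported above can be cross-checked with a little arithmetic, sketched here in Python using only numbers from the output. One interpretation is my own assumption: the one-block difference between the logical volume and the dbfsize figure is presumably the header block at the start of the raw device.

```python
MB = 1024 * 1024
BLOCK_SIZE = 32768

# From lslv: 540 logical partitions (LPs) with a PP size of 32 MB.
lv_bytes = 540 * 32 * MB
lv_blocks = lv_bytes // BLOCK_SIZE

# From v$datafile: the datafile is 540640 Oracle blocks.
datafile_blocks = 540640
datafile_mb = datafile_blocks * BLOCK_SIZE // MB

print(lv_blocks)                        # 552960
print(lv_blocks - 1)                    # 552959, the figure dbfsize reports
print(datafile_mb)                      # 16895
print(lv_blocks - 1 - datafile_blocks)  # 12319
```

The last figure is the number of 32 KB blocks on the device that lie beyond the end of the datafile as the database sees it.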
Note:
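Inside the database the datafile is actually 540640 blocks, although the raw device holds 552959 blocks. When running DBV against a raw device you should use the END parameter to specify where the scan stops.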
This is the last datablock to check in the file. This defaults to the last block of the file but may need specifying for RAW devices
For RAW devices you should use the END parameter to avoid running off the end of the Oracle file space.
eg: “dbv FILE=/dev/rdsk/r1.dbf END=”
If you get the END value too high DBV can report the last page/s of the file as corrupt as these are beyond the end of the Oracle portion of the raw device.
We specify the END parameter and scan with DBV again.
SQL> select BYTES/32768 from v$datafile where FILE#=815;
BYTES/32768
-----------
540640
oracle@anbob1:/home/oracle>dbv file=/dev/rjf_lv16_019 blocksize=32768 end=540640
DBVERIFY: Release 11.2.0.3.0 - Production on Sun Sep 11 14:25:47 2016
Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved.
DBVERIFY - Verification starting : FILE = /dev/rjf_lv16_019
DBVERIFY - Verification complete
Total Pages Examined : 540640
Total Pages Processed (Data) : 322462
Total Pages Failing (Data) : 0
Total Pages Processed (Index): 215474
Total Pages Failing (Index): 0
Total Pages Processed (Other): 2599
Total Pages Processed (Seg) : 0
Total Pages Failing (Seg) : 0
Total Pages Empty : 105
Total Pages Marked Corrupt : 0
Total Pages Influx : 0
Total Pages Encrypted : 0
Highest block SCN : 2846795710 (3483.2846795710)
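The END value used in the run above is simply the datafile size in bytes divided by the block size. A minimal sketch of deriving the command line in Python follows. The function name `dbv_command` is hypothetical, and it only uses the `file`, `blocksize` and `end` parameters that already appear in this post.

```python
def dbv_command(device: str, block_size: int, datafile_bytes: int) -> str:
    """Build a dbv command line that stops at the last Oracle block."""
    if datafile_bytes % block_size:
        raise ValueError("datafile size is not a whole number of blocks")
    end_block = datafile_bytes // block_size
    return "dbv file=%s blocksize=%d end=%d" % (device, block_size, end_block)


# BYTES for file 815 as v$datafile reports it (540640 blocks of 32768 bytes).
print(dbv_command("/dev/rjf_lv16_019", 32768, 540640 * 32768))
# dbv file=/dev/rjf_lv16_019 blocksize=32768 end=540640
```

Taking the size from v$datafile rather than from the device means the scan stops where Oracle's portion of the raw device ends.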
NOTE:
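This time the scan reports no corrupt blocks, so the corrupt blocks reported earlier can be ignored. Then why does DBV scan and flag those blocks by default? My previous note recorded where dbv reads the scan length from. Let's verify it here.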
oracle@anbob1:/home/oracle>dd if=/dev/rjf_lv16_019 of=/tmp/19head.dd bs=32768 count=10
10+0 records in.
10+0 records out.
oracle@anbob1:/home/oracle>od -x /tmp/19head.dd|head
0000000 00e2 0000 ffc0 0000 0000 0000 0000 0000
0000020 3763 0000 0000 8000 0008 6fff 7a7b 7c7d
0000040 0000 21b0 0000 0000 0000 0000 0000 0000
0000060 0000 0000 0000 0000 0000 0000 0000 0000
*
0100000 0be2 0000 cbc0 0001 0000 0000 0000 0104
0100020 eb31 0000 0000 0000 0b20 0300 6df4 43a4
0100040 4249 4c4c 0000 0000 34ee 5f0a 0008 3fe0
0100060 0000 8000 032f 0003 0000 0000 0000 0000
0100100 0000 0000 0000 0000 0000 0000 0000 0000
SQL> @hex 86fff
DEC HEX
----------------------------------- --------------------
552959.000000 86FFF
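The hex lookup above can be reproduced by decoding the od -x dump directly. The Python sketch below copies the bytes from the dump. The field offsets are read off the dump by eye and are assumptions, not a documented layout. In particular, the value at offset 44 of the second block happens to match the 540640 blocks from v$datafile, so I treat it as the datafile size only tentatively.

```python
import struct

# First 32 bytes of block 0 of the raw device, copied from the od -x output.
block0 = bytes.fromhex(
    "00e2 0000 ffc0 0000 0000 0000 0000 0000 "
    "3763 0000 0000 8000 0008 6fff 7a7b 7c7d"
)
# First 48 bytes of the next block (offset 0100000 octal = 32768), same dump.
block1 = bytes.fromhex(
    "0be2 0000 cbc0 0001 0000 0000 0000 0104 "
    "eb31 0000 0000 0000 0b20 0300 6df4 43a4 "
    "4249 4c4c 0000 0000 34ee 5f0a 0008 3fe0"
)

# Assumption: big-endian 32-bit fields at these offsets.
block_size, device_blocks = struct.unpack_from(">II", block0, 20)
(datafile_blocks,) = struct.unpack_from(">I", block1, 44)
db_name = block1[32:36].decode("ascii")

print(block_size, device_blocks)        # 32768 552959
print(db_name, datafile_blocks)         # BILL 540640
print(device_blocks - datafile_blocks)  # 12319
```

The first block carries the device-level count of 552959, while the second carries the database name BILL and, apparently, the 540640 blocks Oracle actually uses.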
NOTE:
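Now the problem is clear. The raw device header records 552959 blocks, while the datafile is actually 540640 blocks, so the blocks Oracle has not used were never formatted. Unless an END position is specified, DBV scans all of them and reports them as corrupt, whereas RMAN does not.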