首页 » PostgreSQL/GaussDB » 什么是 pg_filenode.map 文件?

什么是 pg_filenode.map 文件?

在PostgreSQL中,每个数据库对象都有一个唯一的文件节点号,用于标识对象在磁盘上的位置。这些文件节点号与实际的物理文件相对应,例如表的数据文件、索引文件等。pg_filenode.map文件将这些文件节点号与相应的文件路径进行映射,以便数据库系统能够准确地找到和访问这些对象, 通常从数据字典pg_class可以根据relfilenode字段查找对应的文件。试想一下,在数据库启动时,数据库都没有open如何加载pg_class?如同oracle 数据库一样,同样要加载dict对象,dict对象又是在Bootstrap$, 它一般是在datafile 1# block 520#位置,那PG呢? 在pg_class表中能看到像pg_class、pg_attribute等一些特定的表,在relfilenode字段中的查询结果为0,说明也是在db dict的外部。

OpenGauss 5

/c postgres

select oid, relname, relfilenode,reltablespace
from pg_class
where relfilenode = 0 and relkind = 'r'
openGauss-# order by reltablespace,oid;
 oid  |           relname            | relfilenode | reltablespace
------+------------------------------+-------------+---------------
 1247 | pg_type                      |           0 |             0
 1249 | pg_attribute                 |           0 |             0
 1255 | pg_proc                      |           0 |             0
 1259 | pg_class                     |           0 |             0
 7815 | gs_package                   |           0 |             0
 1136 | pg_pltemplate                |           0 |          1664
 1213 | pg_tablespace                |           0 |          1664
 1214 | pg_shdepend                  |           0 |          1664
 1260 | pg_authid                    |           0 |          1664
 1261 | pg_auth_members              |           0 |          1664
 1262 | pg_database                  |           0 |          1664
 2396 | pg_shdescription             |           0 |          1664
 2964 | pg_db_role_setting           |           0 |          1664
 3450 | pg_resource_pool             |           0 |          1664
 3451 | pg_workload_group            |           0 |          1664
 3457 | pg_auth_history              |           0 |          1664
 3460 | pg_user_status               |           0 |          1664
 3464 | pg_app_workloadgroup_mapping |           0 |          1664
 3592 | pg_shseclabel                |           0 |          1664
 4211 | pg_extension_data_source     |           0 |          1664
 5680 | gs_obsscaninfo               |           0 |          1664
 6126 | pg_subscription              |           0 |          1664
 6134 | pg_replication_origin        |           0 |          1664
 9014 | pgxc_group                   |           0 |          1664
 9015 | pgxc_node                    |           0 |          1664
 9022 | pg_job                       |           0 |          1664
 9023 | pg_job_proc                  |           0 |          1664
 9080 | gs_global_config             |           0 |          1664
(28 rows)

Note:
一共有28个 relfilenode=0的r对象, 其中有5个reltablespace=0, (在postgresql中因为不支持package通常是4个)称为 “nailed” catalogs,其余在tablespace 1664名为pg_global ,称为共享表。

test=# select oid,* from pg_tablespace;
 oid  |  spcname   | spcowner | spcacl | spcoptions | spcmaxsize | relative
------+------------+----------+--------+------------+------------+----------
 1663 | pg_default |       10 |        |            |            | f
 1664 | pg_global  |       10 |        |            |            | f
(2 rows)

那些未在pg_class中的关系表就是在pg_filenode.map文件中, 从postgresql 9中引入,是一个特殊的文件,相当于pg系的bootstrap$ file 1,每个数据库有一个文件,整个群集有一个文件。pg_filenode.map

需要注意的是,pg_filenode.map文件是由PostgreSQL自动生成和维护的,一般情况下不需要手动修改或操作该文件。它在数据库启动时会被加载,并在需要时进行更新和维护。如果需要备份或迁移数据库,通常需要同时备份和恢复pg_filenode.map文件,以保证数据库对象的正确映射关系。目前,每个文件都保持在 512 字节(旨在占用一个标准大小的磁盘扇区),以最大程度地降低更新corrupted的风险;相应的 Struct 设计为恰好占用 512 个字节,最多可以包含 62 个条目 (MAX_MAPPINGS) ,每个条目 8 个字节. 前后有4字节的文件头、4字节的map个数和8字节文件尾校验值。4+4+62*8+8=512

pg_filenode.map文件结构
源码relmapper.c

/*-------------------------------------------------------------------------
   2  *
   3  * relmapper.c
   4  *    Catalog-to-filenumber mapping
   5  *
   6  * For most tables, the physical file underlying the table is specified by
   7  * pg_class.relfilenode.  However, that obviously won't work for pg_class
   8  * itself, nor for the other "nailed" catalogs for which we have to be able
   9  * to set up working Relation entries without access to pg_class.  It also
  10  * does not work for shared catalogs, since there is no practical way to
  11  * update other databases' pg_class entries when relocating a shared catalog.
  12  * Therefore, for these special catalogs (henceforth referred to as "mapped
  13  * catalogs") we rely on a separately maintained file that shows the mapping
  14  * from catalog OIDs to filenumbers.  Each database has a map file for
  15  * its local mapped catalogs, and there is a separate map file for shared
  16  * catalogs.  Mapped catalogs have zero in their pg_class.relfilenode entries.
  17  *
  18  * Relocation of a normal table is committed (ie, the new physical file becomes
  19  * authoritative) when the pg_class row update commits.  For mapped catalogs,
  20  * the act of updating the map file is effectively commit of the relocation.
  21  * We postpone the file update till just before commit of the transaction
  22  * doing the rewrite, but there is necessarily a window between.  Therefore
  23  * mapped catalogs can only be relocated by operations such as VACUUM FULL
  24  * and CLUSTER, which make no transactionally-significant changes: it must be
  25  * safe for the new file to replace the old, even if the transaction itself
  26  * aborts.  An important factor here is that the indexes and toast table of
  27  * a mapped catalog must also be mapped, so that the rewrites/relocations of
  28  * all these files commit in a single map file update rather than being tied
  29  * to transaction commit.
  30  *
  31  * Portions Copyright (c) 1996-2023, PostgreSQL Global Development Group
  32  * Portions Copyright (c) 1994, Regents of the University of California
  33  *
  34  *
  35  * IDENTIFICATION
  36  *    src/backend/utils/cache/relmapper.c
  37  *
  38  *-------------------------------------------------------------------------
  39  */
  
  59 
  60 /*
  61  * The map file is critical data: we have no automatic method for recovering
  62  * from loss or corruption of it.  We use a CRC so that we can detect
  63  * corruption.  Since the file might be more than one standard-size disk
  64  * sector in size, we cannot rely on overwrite-in-place. Instead, we generate
  65  * a new file and rename it into place, atomically replacing the original file.
  66  *
  67  * Entries in the mappings[] array are in no particular order.  We could
  68  * speed searching by insisting on OID order, but it really shouldn't be
  69  * worth the trouble given the intended size of the mapping sets.
  70  */
  71 #define RELMAPPER_FILENAME      "pg_filenode.map"
  72 #define RELMAPPER_TEMP_FILENAME "pg_filenode.map.tmp"
  73 
  74 #define RELMAPPER_FILEMAGIC     0x592717    /* version ID value */
  75 
  76 /*
  77  * There's no need for this constant to have any particular value, and we
  78  * can raise it as necessary if we end up with more mapped relations. For
  79  * now, we just pick a round number that is modestly larger than the expected
  80  * number of mappings.
  81  */
  82 #define MAX_MAPPINGS            64    <<<<<<<<<<<<<<
  83 
  84 typedef struct RelMapping
  85 {
  86     Oid         mapoid;         /* OID of a catalog */
  87     RelFileNumber mapfilenumber;    /* its rel file number */
  88 } RelMapping;
  89 
  90 typedef struct RelMapFile
  91 {
  92     int32       magic;          /* always RELMAPPER_FILEMAGIC */
  93     int32       num_mappings;   /* number of valid RelMapping entries */
  94     RelMapping  mappings[MAX_MAPPINGS];
  95     pg_crc32c   crc;            /* CRC of all above */
  96 } RelMapFile;
  97 

数据目录查找

[og@oel7db1 db]$ find . -name pg_filenode.map -exec ls -l {} \;
-rw------- 1 og og 4096 May  8 11:37 ./base/1/pg_filenode.map
-rw------- 1 og og 4096 May  8 11:37 ./base/15650/pg_filenode.map
-rw------- 1 og og 4096 May 10 11:49 ./base/16385/pg_filenode.map
-rw------- 1 og og 4096 May  8 11:37 ./base/15645/pg_filenode.map
-rw------- 1 og og 4096 Sep  5 18:23 ./base/16549/pg_filenode.map
-rw------- 1 og og 4096 May  8 11:37 ./global/pg_filenode.map

anbob=# select oid,datname,dattablespace from pg_database;
  oid  |  datname  | dattablespace
-------+-----------+---------------
     1 | template1 |          1663
 16385 | anbob     |          1663
 16549 | test      |          1663
 15645 | template0 |          1663
 15650 | postgres  |          1663
(5 rows)

dump 文件结构

[og@oel7db1 16549]$ hexdump pg_filenode.map
0000000 2718 0059 0015 0000 04eb 0000 3b35 0000   -- RELMAPPER_FILEMAGIC、num_mappings x0015=21
0000010 04e1 0000 3b22 0000 04e7 0000 3b26 0000
0000020 04df 0000 3ac7 0000 1e87 0000 3b2e 0000
0000030 0b14 0000 3b28 0000 0b15 0000 3b2a 0000
0000040 1f42 0000 3b30 0000 1f43 0000 3b32 0000
0000050 0a62 0000 3b24 0000 0a63 0000 3b25 0000
0000060 0a66 0000 3b37 0000 0a67 0000 3b38 0000
0000070 26fd 0000 3b39 0000 0a82 0000 3b2b 0000
0000080 25c2 0000 3b2d 0000 24a2 0000 3b2c 0000
0000090 2709 0000 3b34 0000 2608 0000 3b33 0000
00000a0 0a8f 0000 3ac9 0000 0a90 0000 3aca 0000
00000b0 0000 0000 0000 0000 0000 0000 0000 0000
*
0000ff0 0000 0000 0000 0000 06a9 e617 0000 0000
0001000

Note:
第一部分是magic文件头,第二部分是21个map条目个数,第三部分是21个具体的oid与relfilenode对应,第四部分是基于CRC算法的校验值

dump一个global pg_filenode.map

[og@oel7db1 db]$ od -j 8 -N $((512-8-8)) -td4 ./global/pg_filenode.map
0000010        1262       15646        2964       15289
0000030        1213       15295        1136       15299
0000050        1260       15054        1261       15302
0000070        1214       15306        2396       15310
0000110        3457       15562        3460       15558
0000130        9015       15355        9014       15360
0000150        3450       15367        3451       15371
0000170        3464       15375        3592       15399
0000210        9022       15430        9023       15437
0000230        4211       15566        5680       15448
0000250        9080       15467        6134       15536
0000270        6126       15134        2846       15312
0000310        2847       15314        2966       15291
0000330        2967       15293        5504       15362
0000350        5505       15364        2676       15056
0000370        2677       15057        2694       15304
0000410        2695       15305        2671       15648
0000430        2672       15649        7166       15568
0000450        7167       15569        2397       15315
0000470        1137       15301        1232       15308
0000510        1233       15309        2697       15297
0000530        2698       15298        9010       15358
0000550        9024       15359        9012       15365
0000570        9013       15366        9000       15369
0000610        9017       15370        9018       15373
0000630        9019       15374        9020       15377
0000650        9021       15378        9003       15357
0000670        2965       15294        3593       15401
0000710        3458       15564        3459       15565
0000730        3461       15560        3462       15561
0000750        3453       15432        3454       15433
0000770

openGauss=#  select pg_relation_filenode(1213);
 pg_relation_filenode
----------------------
                15295
(1 row)

Note:
正好是62个,如oid 1213 对应 RelFileNumber 15295,使用pg_relation_filenode 函数也可以转换确认.

dump 非global 库下的pg_filenode.map

[og@oel7db1 db]$ od -j 8 -N $((512-8-8)) -td4 ./base/16549/pg_filenode.map
0000010        1259       15157        1249       15138
0000030        1255       15142        1247       15047
0000050        7815       15150        2836       15144
0000070        2837       15146        8002       15152
0000110        8003       15154        2658       15140
0000130        2659       15141        2662       15159
0000150        2663       15160        9981       15161
0000170        2690       15147        9666       15149
0000210        9378       15148        9993       15156
0000230        9736       15155        2703       15049
0000250        2704       15050           0           0
0000270           0           0           0           0
*
0000770
[og@oel7db1 db]$ od -j 8 -N $((512-8-8)) -td4 ./base/15650/pg_filenode.map
0000010        1259       15157        1249       15138
0000030        1255       15142        1247       15047
0000050        7815       15150        2836       15144
0000070        2837       15146        8002       15152
0000110        8003       15154        2658       15140
0000130        2659       15141        2662       15159
0000150        2663       15160        9981       15161
0000170        2690       15147        9666       15149
0000210        9378       15148        9993       15156
0000230        9736       15155        2703       15049
0000250        2704       15050           0           0
0000270           0           0           0           0
*
0000770

[og@oel7db1 db]$ od -j 8 -N $((512-8-8)) -td4 ./base/16385/pg_filenode.map
0000010        1259       15157        1249       15138
0000030        1255       15142        1247       15047
0000050        7815       15150        2836       15144
0000070        2837       15146        8002       15152
0000110        8003       15154        2658       15140
0000130        2659       15141        2662       15159
0000150        2663       15160        9981       15161
0000170        2690       15147        9666       15149
0000210        9378       15148        9993       15156
0000230        9736       15155        2703       15049
0000250        2704       15050           0           0
0000270           0           0           0           0
*
0000770

非global下的pg_filenode.map或叫local map file的内容基本上是一样的,因为很少对catalog 对象做vacuum full(会导致relfilenode改变)。 所以如果当Local mapfile 丢失或误删除,影响该库的访问时,可以从其它数据库先copy一份pg_filenode.map做为恢复尝试。

加载relmap的函数

/*
 755  * load_relmap_file -- load the shared or local map file
 756  *
 757  * Because these files are essential for access to core system catalogs,
 758  * failure to load either of them is a fatal error.
 759  *
 760  * Note that the local case requires DatabasePath to be set up.
 761  */
 762 static void
 763 load_relmap_file(bool shared, bool lock_held)
 764 {
 765     if (shared)
 766         read_relmap_file(&shared_map, "global", lock_held, FATAL);
 767     else
 768         read_relmap_file(&local_map, DatabasePath, lock_held, FATAL);
 769 }
 770 
 771 /*
 772  * read_relmap_file -- load data from any relation mapper file
 773  *
 774  * dbpath must be the relevant database path, or "global" for shared relations.
 775  *
 776  * RelationMappingLock will be acquired released unless lock_held = true.
 777  *
 778  * Errors will be reported at the indicated elevel, which should be at least
 779  * ERROR.
 780  */
 781 static void
 782 read_relmap_file(RelMapFile *map, char *dbpath, bool lock_held, int elevel)
 783 {
 784     char        mapfilename[MAXPGPATH];
 785     pg_crc32c   crc;
 786     int         fd;
 787     int         r;
 788 
 789     Assert(elevel >= ERROR);
 

postgresql提供了一组功能函数可以查询oid与relfilenodel转换

openGauss=#  select pg_relation_filenode(8003);
 pg_relation_filenode
----------------------
                15154
(1 row)

openGauss=# select pg_filenode_relation(0,15154)::oid; --0 表未默认表空间
 pg_filenode_relation
----------------------
                 8003
(1 row)

openGauss=# select pg_filenode_relation(0,15154);
     pg_filenode_relation
------------------------------
 pg_toast.pg_toast_7815_index
(1 row)

当数据库打开始可以使用上面的方法转换,如果数据库未打开始,对于非普通表以外的就是使用pg_filenode.map得到oid与relfilenode的对应关系,然后可以找到对应的db目录,从对应db目录下找对应的pg_class得到对应的名称.

使用第三方插件pg_filenodemapdata可以 转换硬编码的名称转换

postgres@oel7db1 16447]$ pg_filenodemapdata  ./pg_filenode.map
magic:               0x00592717
num_mappings:        17

 0) 1259 - pg_class:                                  1259
 1) 1249 - pg_attribute:                              1249
 2) 1255 - pg_proc:                                   1255
 3) 1247 - pg_type:                                   1247
 4) 2836 - pg_toast_1255:                             2836
 5) 2837 - pg_toast_1255_index:                       2837
 6) 4171 - unlisted system catalog relation:          4171
 7) 4172 - unlisted system catalog relation:          4172
 8) 2658 - pg_attribute_relid_attnam_index:           2658
 9) 2659 - pg_attribute_relid_attnum_index:           2659
10) 2662 - pg_class_oid_index:                        2662
11) 2663 - pg_class_relname_nsp_index:                2663
12) 3455 - pg_class_tblspc_relfilenode_index:         3455
13) 2690 - pg_proc_oid_index:                         2690
14) 2691 - pg_proc_proname_args_nsp_index:            2691
15) 2703 - pg_type_oid_index:                         2703
16) 2704 - pg_type_typname_nsp_index:                 2704

file checksum:       0x546114BB

对于opengauss数据库下对于pg_package未编码,所以可能是未知。

对于表文件路径的样式有三类

* For files in the default tablespace, base/database_oid/filenode id for the relation
* For files in other tablespaces: pg_tblspc / tablespace_oid / tablespace_version_subdir / database_oid / filenode id for the relation
* For shared relations (see below): global/filenode id for the relation

如果pg_class中relfilenode为零,在这种情况下,文件通过pg_relfilenode.map定位。这是共享目录和某些系统目录的典型情况,例如pg_database、pg_class和pg_proc等。

— over —

打赏

对不起,这篇文章暂时关闭评论。