ubifs关键技术_tribanana

(`struct ubifs_data_node`)，或inode节点(`struct ubifs_ino_node`)，其表示VFS的inode，所有的node类型都有个通用头结构(`ubifs_ch`)，
为了避免在每次变化时重写B+树，B+树被实现为wandering树，只有被改变的节点会被重写，老的节点会被废除但不会马上删除，这样索引不会只存储在一个地方，而是四处游走，且在flash上会有多个过时版本，只要存储它们的LEB没有被回收，为了找到最新版本的索引，UBIFS在LEB1存储了一种特殊的节点成为*master node*，它总是指向最新的UBIFS索引的根节点，为了可恢复性，UBIFS定义了LEB1的备份，在LEB2，这样在加载UBIFS时，系统只需要读取LEB1或者LEB2中得到最新的索引
TNC是flash上的索引在内存中的表现，包含了一些运行时的非持续属性，例如节点变动标记指示下次写flash时节点必须更新到flash，TNC作为一个回写缓存机制，所有对flash的索引的修改都要通过TNC，和其他的缓存一样，TNC没有必要镜像所有的索引到内存，而是读取需要的部分，commit操作把更新flash上的内容如索引，在每一次commit操作中，标记为dirty的TNC节点被写入flash以更新索引。

### Journal日志信息
为了避免flash磨损，索引更新只有在一定条件满足时回写，日志用来记录在索引的commit间隔的任何更新(以inode和数据节点的格式)，在加载时，日志从flash读取并在内存重构为TNC。UBIFS为日志信息预留了一束LEB来
，称为*log area*，log area的LEB数量在文件系统创建时是可配置的，并存储在超级块节点，在执行索引commit时，log area中指包含2中类型的节点：*reference nodes*和*commit start nodes*；在回写索引时，commit start node节点也被写入，Reference nodes在每次日志更新时都被写入，每个参考节点指向其他节点在flash上的位置(inode nodes, data nodes etc.)，这些节点是flash上的日志实体的一部分，这些节点被称为*buds*，用来描述实际的文件系统变化以及它们的文件数据.
log area构建为一个环形结构，在日志信息快满时，发起commit操作，commit start node节点被写入，在加载时UBIFS查找最新的commit start node节点并重构之后的每一个reference node，commit start node之前的所有reference node会被忽略，因为这些节点已经存在于flash的索引；在写入一个日志实体时，UBIFS首先保证有足够的空间来写入reference node和该实体的bud部分，然后referene node被写入，随后是描述文件系统变化的bud部分，在重构时，UBIFS记录每一个referene node并检查被引用的LEB的位置来发现bud，如果这些都损坏或丢失了，UBIFS会试图重新读取LEB来恢复，但这只是针对日志信息中最后一个 referenced LEB，因为只有这才会由于断电而损坏，如果不能恢复，UBIFS将不能加载。

LEB属性树/表{LPT}

LEB属性树用来存储每个LEB的信息，这包括LEB类型和以及LEB上的空闲和*dirty* (old, obsolete content)空间；类型信息是重要的，因为UBIFS重不在同一个LEB混用索引节点和数据节点，因此每个LEB都有特定目的，这也有利于计算空闲空间；LEB属性树也是个B+树，但比索引树(TNC)要小很多，因此在每次commit时可以写入到一个chunk中，故而保存LPT树是个原子操作。
LPT使用B+树统计系统来统计系统中逻辑块的使用状态，UBIFS通过遍历LPT来获得具体逻辑块的使用情况，LPT使用以下数据表示一个块的使用状态：free space，表示块中的剩余空间，数据可以写入free space；dirty space，表示一个块中的无效数据，垃圾回收子系统可以回收的空间；flags代表该块是否是索引块。结构定义如下：
    struct ubifs_lprops {
       int free;
       int dirty;
       int flags;
       ...........
    };
LPT组织结构如图所示：

图中，空白方块表示内部节点，即ubifs_nnode结构；最下一层的叶子节点存储逻辑块的使用信息，称为ubifs_pnode结构，内部节点只提供索引pnode的功能，UBIFS定义的每个节点扇出系数是4，所以每个pnode节点可以表示4个逻辑块的信息。LPT树的高度保存在ubifs_info->lpt_hght中，root nnode的level是ubifs_info->lpt_hght，child nnode level为parent level减1，最低的nnode节点level为1; pnode的level为0。LPT属性树对于查找空闲空间，发现无效数据最多块以运行垃圾回收机制是最基本的。

ubifs文件系统在初始时调用mount_ubifs函数加载ubifs，此函数在nand flash上内容为空时会以缺省参数创建ubifs系统，调用过程：
    mount_ubifs-》
     ubifs_read_superblock-》
      create_default_filesystem-》
       ubifs_create_dflt_lpt
ubifs_create_dflt_lpt函数会在LPT区域创建缺省的LPT表信息，LPT有两种不同的形式big model和small model，使用哪种模式由LEB Properties table的大小决定的，当整个LEB Properties table可以写入单个eraseblock中时使用small model，否则big model。对于big model，lsave被用来保存部分重要的LEB properties, 以加快超找特定LEB的速度。LPT信息包括pnode信息，nnode信息，如果LPT是big模式，后面跟着LPT's save table信息，如果LPT信息大于一个LEB块大小，则后面跟着LPT表自身的LPT属性表信息ltab。
在on-flash的LPT信息内容中，所有信息都以位域结构来表示，LPT属性结构可以表示为：
    struct ubifs_pnode {
      unsigned short crc;
      int type: 4；
#if BIG_MODE
   int num : pcnt_bits；
#endif
      struct {
        int freespace : space_bits；
        int dirty   ：space_bits；
        int properflg ：1； // 类型为LPROPS_INDEX置位，其他为0；
      } lprops[UBIFS_LPT_FANOUT];
    }
    每一个pnode的缺省扇出系数是4，即可以表示4个LEB的属性，第一个pnode包含了root inode node的LEB
    属性和root index node of the index tree的LEB属性,随后所有的pnode节点按序写入LPT区域。

在pnode节点信息之后的是nnode信息，nnode节点是层次结构的，on-flash上最开始的是最下一行的nnode信
息，直到根节点信息在最后。nnode节点信息在on-flash上的结构示意如下：
    struct ubifs_nnode {
      unsigned short crc;
      int type: 4；
#if BIG_MODE
   int num : pcnt_bits；
#endif
      struct {
        int lnum : lpt_lnum_bits；
        int loff ：lpt_offs_bits；
        int properflg ：1； // 类型为LPROPS_INDEX置位，其他为0；
      } nbranch[UBIFS_LPT_FANOUT];
    }
如果是big模式,则紧跟在nnode树信息后的事lsave表,缺省lsave表有256项,内容为主存储LEB块序号,
在on-flash上的表示为：
    struct ubifs_nnode {
      unsigned short crc;
      int type: 4；
      int lsave[i] ：lnum_bits；// 256 个
    }
最后是ltab内容，即LPT表自身的LEB属性表。在on-flash上的结构示意如下：
    struct ubifs_nnode {
      unsigned short crc;
      int type: 4；
      struct {
       int free : lpt_spc_bits;
       int dirty : lpt_spc_bits;
      } [lpt_lebs];
    }

其他节点的初始化

   在完成LPT的on-flash初始化后,接着完成UBIFS的超级块初始化,超级块位于文件系统的LEB0/0的位置,信息
   定义在结构ubifs_sb_node中：在初始化后，调用ubifs_write_node(c, sup, UBIFS_SB_NODE_SZ, 0, 0);
   写入flash的LEB0，offset 0位置。
    struct ubifs_sb_node {
      struct ubifs_ch ch;
      __u8 padding[2];
      __u8 key_hash;      // type of hash function used in keys
      __u8 key_fmt;       // format of the key
      __le32 flags;       // file-system flags (%UBIFS_FLG_BIGLPT, etc)
      __le32 min_io_size;    // minimal input/output unit size， e.g 128K
      __le32 leb_size;     // logical eraseblock size in bytes e.g. 128K- 2k -2k
      __le32 leb_cnt;      // count of LEBs used by file-system
      __le32 max_leb_cnt;    // maximum count of LEBs used by file-system
      __le64 max_bud_bytes;   // maximum amount of data stored in buds
      __le32 log_lebs;     //   log size in logical eraseblocks
      __le32 lpt_lebs;     // number of LEBs used for lprops table
      __le32 orph_lebs;    // number of LEBs used for recording orphans
      __le32 jhead_cnt;    // count of journal heads
      __le32 fanout;      // tree fanout (max. number of links per indexing node)
      __le32 lsave_cnt;    // number of LEB numbers in LPT's save table
      __le32 fmt_version;
      __le16 default_compr; // default compression algorithm (%UBIFS_COMPR_LZO, etc)
      __u8 padding1[2];
      __le32 rp_uid;     //
      __le32 rp_gid;
      __le64 rp_size;    // size of the reserved pool in bytes
      __le32 time_gran;
      __u8 uuid[16];
      __le32 ro_compat_version;
      __u8 padding2[3968];
    };
主节点存储了所有不在固定逻辑位置的on-flash结构，主节点不断重复地写入到LEB1和LEB2，LEB1和LEB2是相同的，以用于主节点区域可能的失效，初始化时第一个缺省的master node如下，会同时写入LEB1和LEB2.
    mst->ch.node_type   = UBIFS_MST_NODE;
   mst->log_lnum     = cpu_to_le32(UBIFS_LOG_LNUM); //log区域起始LEB,缺省为3
   mst->highest_inum   = cpu_to_le64(UBIFS_FIRST_INO); 在committed index的最高inode号
   mst->cmt_no      = 0;   commit number
   mst->root_lnum     = cpu_to_le32(main_first + DEFAULT_IDX_LEB); root index node的leb号
   mst->root_offs     = 0;    offset within @root_lnum
   tmp = ubifs_idx_node_sz(c, 1);
   mst->root_len      = cpu_to_le32(tmp); root indexing node length
   mst->gc_lnum      = cpu_to_le32(main_first + DEFAULT_GC_LEB);
   mst->ihead_lnum     = cpu_to_le32(main_first + DEFAULT_IDX_LEB);
   mst->ihead_offs    = cpu_to_le32(ALIGN(tmp, c->min_io_size)); LEB number of index head
   mst->index_size    = cpu_to_le64(ALIGN(tmp, 8));
   mst->lpt_lnum     = cpu_to_le32(c->lpt_lnum); LEB number of LPT root nnode
   mst->lpt_offs     = cpu_to_le32(c->lpt_offs);
   mst->nhead_lnum    = cpu_to_le32(c->nhead_lnum); LEB number of LPT head
   mst->nhead_offs    = cpu_to_le32(c->nhead_offs);
   mst->ltab_lnum     = cpu_to_le32(c->ltab_lnum);
   mst->ltab_offs     = cpu_to_le32(c->ltab_offs);
   mst->lsave_lnum    = cpu_to_le32(c->lsave_lnum);
   mst->lsave_offs    = cpu_to_le32(c->lsave_offs);
   mst->lscan_lnum    = cpu_to_le32(main_first);
   mst->empty_lebs    = cpu_to_le32(main_lebs - 2);
   mst->idx_lebs     = cpu_to_le32(1);
   mst->leb_cnt      = cpu_to_le32(c->leb_cnt);
   tmp64 = main_bytes;
   tmp64 -= ALIGN(ubifs_idx_node_sz(c, 1), c->min_io_size);
   tmp64 -= ALIGN(UBIFS_INO_NODE_SZ, c->min_io_size);
   mst->total_free = cpu_to_le64(tmp64);
   tmp64 = ALIGN(ubifs_idx_node_sz(c, 1), c->min_io_size);
   ino_waste = ALIGN(UBIFS_INO_NODE_SZ, c->min_io_size) - UBIFS_INO_NODE_SZ;
   tmp64 += ino_waste;
   tmp64 -= ALIGN(ubifs_idx_node_sz(c, 1), 8);
   mst->total_dirty = cpu_to_le64(tmp64);
   tmp64 = ((long long)(c->main_lebs - 1) * c->dark_wm);
   mst->total_dark = cpu_to_le64(tmp64);
   mst->total_used = cpu_to_le64(UBIFS_INO_NODE_SZ);
   err = ubifs_write_node(c, mst, UBIFS_MST_NODE_SZ, UBIFS_MST_LNUM, 0); // 写入LEB1
   if (err) {
        kfree(mst);
        return err;
   }
   err = ubifs_write_node(c, mst, UBIFS_MST_NODE_SZ, UBIFS_MST_LNUM + 1,0） // 写入LEB2
================================================================
log区域的第一个节点是commit start node，在创建缺省UBIFS系统时，会在log区域写入一个fake cs节点以确保文件系统的正常工作：

   tmp = ALIGN(UBIFS_CS_NODE_SZ, c->min_io_size);
   cs = kzalloc(tmp, GFP_KERNEL);
   if (!cs)
        return -ENOMEM;

   cs->ch.node_type = UBIFS_CS_NODE;
   err = ubifs_write_node(c, cs, UBIFS_CS_NODE_SZ, UBIFS_LOG_LNUM, 0);
   kfree(cs);
   if (err)
        return err;
-----------------------------------------------------------
log区域为日志系统的一部分，从LEB3开始，占用空间不确定。UBIFS使用日志的目的是为了减少对flash index的更新频率，因为更新文件系统时，一旦添加叶子节点，整个文件系统的索引节点都要定期更新，这样的话会非常影响效率。因此采用日志区间，当添加叶子节点时，会先将其添加到日志中，只更新内存中的节点，不再提交到flash中，然后再定期提交日志，这样的话，效率会有极大的提高。log记录了日志信息

ubifs关键技术

linux

ubifs