YAFFS2
YAFFS2 Overview
This page provides a quick overview of the YAFFS2 file system. For a more complete description, see How Yaffs Works.
Details on how TSK implements YAFFS2 can be found in YAFFS2 Implementation Notes. The description below should be enough to understand the basic implementation.
YAFFS2 Terms
- Chunk : Data unit consisting of a page and spare area (you can think of a chunk as a cluster in NTFS/FAT -- with some extra spare area that is not storing content)
- Block : Group of chunks (a block is the unit of erasure)
- Object : A YAFFS2 file/directory/etc
- Object ID : Unique identifier for each object (you can think of this as the meta data address, but we use a different meta data address to better deal with different versions of an object)
- Chunk ID : Position of this chunk in the file (0 = header, 1 = first chunk with content, 2 = second chunk with content, etc.)
- Sequence Number : Incremented each time a new block is written, and stored in every chunk of that block (used to order blocks chronologically)
YAFFS2 Objects
A YAFFS2 Object (file, directory, etc.) consists of a header chunk, storing all metadata for the object, and zero or more data chunks. The spare area of each chunk will contain an object ID, sequence number, chunk ID, and file size, and possibly the type of object and the object ID of its parent (the type and parent object ID will also be in the data portion of the header chunk).
A YAFFS2 file system consists entirely of these objects - there is no master record of files or directory structure. The parent object ID field in each object is the only source for reconstructing the file hierarchy.
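As a rough illustration of how the hierarchy can be rebuilt from parent object IDs alone, here is a minimal Python sketch. The record layout, the object IDs, and the names are hypothetical and chosen only for the example; this is not how TSK stores or walks the data.

```python
# Hypothetical records: (object ID, parent object ID, name), as they might be
# recovered from the newest header chunk of each object. The root entry with
# no parent is an assumption made for this example.
headers = [
    (1, None, ""),
    (257, 1, "docs"),
    (500, 257, "temp.txt"),
    (501, 1, "notes.txt"),
]

def build_paths(headers):
    """Return {object ID: full path} using only the parent ID links."""
    by_id = {obj_id: (parent, name) for obj_id, parent, name in headers}
    paths = {}
    for obj_id in by_id:
        parts = []
        current = obj_id
        seen = set()
        # Walk up towards the root, stopping on a missing parent or a cycle.
        while current in by_id and current not in seen:
            seen.add(current)
            parent, name = by_id[current]
            parts.append(name)
            current = parent
        paths[obj_id] = "/".join(reversed(parts)) or "/"
    return paths

paths = build_paths(headers)
```

With these sample records, object 500 resolves to "/docs/temp.txt" purely by following parent links; no directory table is consulted.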
Basic YAFFS2 Operation
YAFFS2 is a log-structured file system that writes only once to each chunk. It does not use deletion markers; instead, it stores enough information to reconstruct the chronological order of the chunks, so that the most recent version of each one can be used. The primary tool for this is the sequence number stored in each chunk. The sequence number is incremented with each new block written, so ordering blocks by sequence number yields a chronological list regardless of where the blocks sit in flash memory. Chunks are written sequentially within each block, so chunks early in a block are older than chunks that occur later.
For those not familiar with the workings of flash memory, an entire block is erased at a time. Once a chunk is written to, it cannot be changed without resetting the entire block that it belongs to. When the block is reset, it gets a new sequence number.
As an example, if we create a file temp.txt with 2 chunks worth of data, and then the first chunk of data is changed, we could see the following:
Sequence number | Offset | Object ID | Chunk ID | Notes |
1000 | 0x29400 | 500 | 0 | Object header containing file name "temp.txt" and other metadata |
1000 | 0x29c40 | 500 | 1 | First chunk of "temp.txt" |
1000 | 0x2a480 | 500 | 2 | Second chunk of "temp.txt" |
1000 | 0x2acc0 | 500 | 1 | First chunk of "temp.txt" |
The first version of chunk 1 is still there, but since we have a newer one it will now be ignored.
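The selection rule can be sketched in a few lines of Python. This is an illustrative sketch, not TSK's code: it assumes that each block carries a distinct sequence number and that, within a block, a higher offset means a later write, so sorting by (sequence number, offset) gives chronological order.

```python
# Rows from the table above: (sequence number, offset, object ID, chunk ID).
chunks = [
    (1000, 0x29400, 500, 0),
    (1000, 0x29C40, 500, 1),
    (1000, 0x2A480, 500, 2),
    (1000, 0x2ACC0, 500, 1),
]

def latest_versions(chunks):
    """Map (object ID, chunk ID) to the offset of its most recent version."""
    latest = {}
    # Tuples sort by sequence number first, then by offset, so later writes
    # overwrite earlier ones in the dictionary.
    for seq, offset, obj_id, chunk_id in sorted(chunks):
        latest[(obj_id, chunk_id)] = offset
    return latest

latest = latest_versions(chunks)
```

For chunk 1 of object 500, the entry that survives is the one at 0x2acc0; the copy at 0x29c40 is superseded.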
If we then delete the file, it will get two new header chunks with the file name changed to "unlinked" or "deleted", the size set to zero, and the parent ID set to the unlinked or deleted folder, respectively.
Sequence number | Offset | Object ID | Chunk ID | Notes |
1000 | 0x29400 | 500 | 0 | Object header containing file name "temp.txt" and other metadata |
1000 | 0x29c40 | 500 | 1 | First chunk of "temp.txt" |
1000 | 0x2a480 | 500 | 2 | Second chunk of "temp.txt" |
1000 | 0x2acc0 | 500 | 1 | First chunk of "temp.txt" |
1006 | 0x02940 | 500 | 0 | Unlinked header |
1006 | 0x03180 | 500 | 0 | Deleted header |
Again, all the old data is still present (though at some point it may be garbage collected) but it will be ignored since we have a new header. Also note how the deleted header has a lower offset than the older data but a higher sequence number.
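To see why the offsets and sequence numbers line up this way, the offsets can be mapped back to blocks. The sketch below assumes the default geometry (2048-byte pages, 64-byte spare areas, 64 chunks per block) and an image in which each page is immediately followed by its spare area, starting at offset 0; the function name is made up for the example.

```python
PAGE_SIZE = 2048         # assumed default page size
SPARE_SIZE = 64          # assumed default spare size
CHUNKS_PER_BLOCK = 64    # assumed default chunks per block

def locate(offset):
    """Return (block number, chunk index within the block) for a chunk offset."""
    chunk_size = PAGE_SIZE + SPARE_SIZE
    chunk_index, remainder = divmod(offset, chunk_size)
    if remainder:
        raise ValueError("offset is not aligned to a chunk boundary")
    return divmod(chunk_index, CHUNKS_PER_BLOCK)

original_header = locate(0x29400)
deleted_header = locate(0x03180)
```

Under these assumptions the original header sits in block 1 and the deleted header in block 0. Block 0 simply received the higher sequence number (1006) because it was written later than block 1 (1000).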
Yaffs2 TSK Configuration File
The Yaffs2 code in TSK uses the most common settings for page size, spare size, and number of chunks per block, and attempts to detect the spare area offsets automatically. If this fails, the user can create a configuration file. The configuration file should have the same name as the image (or the first segment of the image) followed by "-yaffs2.config" (MAY CHANGE).
Configuration File Format
The configuration file supports the following parameters:
- Flash layout-related (any combination of these can be present; missing values fall back to their defaults)
  - flash_page_size (default 2048)
  - flash_spare_size (default 64)
  - flash_chunks_per_block (default 64)
- Spare layout-related (specify either all three or none; the auto-detection routine runs if none are specified)
  - spare_seq_num_offset
  - spare_obj_id_offset
  - spare_chunk_id_offset
See the Yaffs2 summary above or look in the references for more information on each field. Note that there is a fourth spare area field, nBytes, but it is not currently needed to load a Yaffs2 image so we omit it from the configuration file.
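The following Python sketch shows how the three spare offsets might be used to pull fields out of one spare area. It assumes each field is a 4-byte little-endian unsigned integer, which this page does not state; the function name and the sample values are hypothetical.

```python
import struct

def read_spare_fields(spare, seq_num_offset, obj_id_offset, chunk_id_offset):
    """Extract the three spare fields, assuming 32-bit little-endian values."""
    def u32(offset):
        return struct.unpack_from("<I", spare, offset)[0]
    return {
        "seq_num": u32(seq_num_offset),
        "obj_id": u32(obj_id_offset),
        "chunk_id": u32(chunk_id_offset),
    }

# Build a fake 64-byte spare area with the fields at hypothetical offsets
# 12, 16 and 20, then read them back.
spare = bytearray(64)
struct.pack_into("<III", spare, 12, 1000, 500, 1)
fields = read_spare_fields(bytes(spare), 12, 16, 20)
```

If the offsets are wrong for a given image, the decoded values will typically look implausible (for example, very large object IDs), which is one reason a configuration file may be needed when auto-detection fails.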
Sample Configuration File
# Yaffs2 config file
spare_seq_num_offset = 12
spare_obj_id_offset = 16
spare_chunk_id_offset = 20
flash_page_size = 4096
flash_spare_size = 128
In this case, flash_chunks_per_block is not specified, so it falls back to its default value of 64.
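The rules above (defaults for missing flash layout values, all-or-none for the spare offsets) can be sketched as a small Python parser. This is not TSK's parser: it assumes one "key = value" pair per line, "#" starting a comment, and decimal integer values, and the function names are invented for the example.

```python
DEFAULTS = {
    "flash_page_size": 2048,
    "flash_spare_size": 64,
    "flash_chunks_per_block": 64,
}
SPARE_KEYS = ("spare_seq_num_offset", "spare_obj_id_offset", "spare_chunk_id_offset")

def config_path_for(image_path):
    """Configuration file name: the image name followed by '-yaffs2.config'."""
    return image_path + "-yaffs2.config"

def parse_config(text):
    """Parse config text, applying defaults and the all-or-none spare rule."""
    values = dict(DEFAULTS)
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        key, sep, value = line.partition("=")
        key = key.strip()
        if not sep or (key not in DEFAULTS and key not in SPARE_KEYS):
            raise ValueError(f"unrecognized line: {line!r}")
        values[key] = int(value.strip())
    present = [k for k in SPARE_KEYS if k in values]
    if present and len(present) != len(SPARE_KEYS):
        raise ValueError("spare offsets must be given all together or not at all")
    return values

sample = """# Yaffs2 config file
spare_seq_num_offset = 12
spare_obj_id_offset = 16
spare_chunk_id_offset = 20
flash_page_size = 4096
flash_spare_size = 128
"""
config = parse_config(sample)
```

Parsing the sample leaves flash_chunks_per_block at 64 while the page and spare sizes take the values from the file. A caller would treat the absence of all three spare keys as the signal to run auto-detection.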