stefan-gloor.ch

Implementing FAT32 for the Barrelfish Microkernel

As part of an operating systems class at university, we built our own, customized version of the Barrelfish microkernel/multikernel operating system running on an i.MX8X processor. In a group effort, we implemented the basic OS internals, such as physical memory management, paging, multiprocessing and inter-process communication. As an individual project I implemented a user space FAT32 file system driver, which would allow our embedded system to interact with the SDHC card.

FAT32

FAT32 was initially designed by Microsoft, but has been adopted by numerous other types of system and has become somewhat of a de facto standard file system due to its simplicity and ease of use. However, FAT32 has some shortcomings compared to more modern and more advanced file system implementations. FAT32 is not journaled, meaning that sudden power loss (a problem for mobile and embedded devices) in an unlucky moment might lead to data corruption. Further, there is not native or extensive support for long file names or permissions, but this can be “fixed” by using (sometimes system-specific, or patented) extensions, such as VFAT.

A FAT32 image consists of several important parts: A boot sector containing metadata of the filesystem, such as the block and cluster sizes, and the index of the root data cluster; The file allocation table (FAT) with optional backup copies, and data blocks. The FAT is an out-of-band metadata storage for allocating and ordering the data blocks. It acts as a map of clusters, where a cluster is a group with a fixed number of disk blocks. Each cluster entry (a 4 byte index) in the FAT represents one data cluster (e.g., 4 kiB). For files longer than one cluster, the cluster entry contains the index of the next cluster, forming a chain which is continued until an “end-of-chain” marker is reached.

diagram showing how FAT entries map to clusters
Diagram showing an example cluster traversal. The boot record indicates where the root cluster is (usually at index 2). From there, the FAT is traversed from entry to entry until an “end-of-chain” marker is encountered. The numbers in the FAT represent the current cluster index, whereas the arrow depicts the value of that entry, i.e., the index of the next cluster. In this example, the root directory needs to be pieced together by concatenating data blocks 2, 7, 36 and 74.

Directories are represented by special files which have a defined structure in their data clusters. Each file entry in such a structure points to the start cluster of that particular file or subdirectory.

OS Integration

As we worked on a microkernel system, I implemented the file system driver as a user space library, with the low-level SDHC block driver residing in its own process. Because our operating system did not support IPC channels among arbitrary processes, but only to the init process, the file system first contacts init, which then relays the message further to the SDHC block driver.

IPC diagram of the filesystem communicating with the block driver through init
Diagram showing the IPC messages during initialization and the first block read. The process using the filesystem (e.g., shell) polls the init process to check whether the block driver is available. Once the block driver has registered itself with init, init will notify the file system library which will then start to issue requests. These requests are relayed to the SDHC block driver as responses to a blocking “listen” call. This way, init never issues blocking calls, but only receives or directly answers IPC messages.

The file system was registered with the libc functions and integrated with the shell process, allowing the file system to be used with familiar commands like ls, cat etc.