BetrFS: Write-Optimization in a Kernel File System
Speaker(s): Don Porter (University of North Carolina at Chapel Hill)
Applications exhibit a mixture of I/O patterns, ranging from large,
sequential reads to small, random writes, yet general-purpose file
system designs trade good performance on some I/O patterns for poor
performance on others. For instance, ext4 is designed to update
data in place. It can issue sequential reads and writes at disk bandwidth,
but realizes only a small fraction of disk bandwidth for the small, random
writes commonly issued by applications such as SQLite or common IMAP
email servers.
This talk describes BetrFS, an in-kernel file system for Linux that offers good
performance on all operations. First, BetrFS uses a data structure
called a B^ε-tree to index on-disk data. A B^ε-tree
eliminates the trade-off between small, random writes and large,
sequential scans. BetrFS also introduces techniques at the OS
and data structure level to smoothly navigate other tensions, such as
balancing large directory rename performance against maintaining
on-disk locality for efficient directory searches.
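The write-optimization idea behind this kind of tree can be illustrated with a toy model. The sketch below is not BetrFS code and is far simpler than a real B^ε-tree: it assumes a single root with a message buffer above a few leaves, and the class name, method names, and the leaf_writes counter are all hypothetical, chosen only for illustration. It shows small writes being absorbed by the buffer and later moved to a leaf in one batch, so that several updates share the cost of a single leaf write.

```python
import bisect


class TinyBufferedTree:
    """Two-level toy: one root holding a message buffer, plus several leaves."""

    def __init__(self, pivots, buffer_capacity=4):
        self.pivots = sorted(pivots)      # keys that separate the leaves
        self.leaves = [dict() for _ in range(len(self.pivots) + 1)]
        self.buffer = {}                  # pending upserts: key -> value
        self.buffer_capacity = buffer_capacity
        self.leaf_writes = 0              # stand-in for random disk I/Os

    def _child(self, key):
        # Index of the leaf responsible for this key.
        return bisect.bisect_right(self.pivots, key)

    def put(self, key, value):
        # A write only touches the root buffer; no leaf is written yet.
        self.buffer[key] = value
        if len(self.buffer) >= self.buffer_capacity:
            self._flush()

    def _flush(self):
        # Group pending messages by destination leaf, then move the
        # largest group down with a single (simulated) leaf write.
        groups = {}
        for key in self.buffer:
            groups.setdefault(self._child(key), []).append(key)
        target = max(groups, key=lambda i: len(groups[i]))
        for key in groups[target]:
            self.leaves[target][key] = self.buffer.pop(key)
        self.leaf_writes += 1

    def get(self, key):
        # Buffered messages are newer than leaf contents, so check them first.
        if key in self.buffer:
            return self.buffer[key]
        return self.leaves[self._child(key)].get(key)
```

In this toy, four put calls that target the same leaf cost one simulated leaf write rather than four, while get still sees the latest value because it consults the buffer before the leaf. A real B^ε-tree applies the same buffering at every internal node of a multi-level tree.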
Compared to commodity file systems, BetrFS can improve workload
performance by up to two orders of magnitude, and generally matches other
file systems in the worst cases. For example, BetrFS improves
performance of the Dovecot IMAP server by up to 41% over
other file systems, such as ext4 or btrfs, and can improve
rsync performance by up to 31.5x.
More information about BetrFS, including source code,
is available at betrfs.org.