NILFS is a unique file system that stands apart from the traditional mainstream file systems most people are familiar with.
The technical aspects of the NILFS file system are beautifully explained by Neil Brown in his article "A NILFS2 score card".
NILFS2 has a much smaller code base than Btrfs. Unlike Btrfs, NILFS2 is not overburdened with to-be-completed features.
The NILFS2 design is elegant: as a log-structured file system it is mostly a "log" or a "journal". NILFS2 uses copy-on-write by design, so instead of overwriting data it appends changes to the end of the log.
A distinctive feature of NILFS2 is automatic snapshotting. Every operation on the file system adds to the log and creates another snapshot (a "checkpoint" in NILFS terms). Snapshots make it possible to access historical data, inspect changes and revert them if necessary.
Snapshots are also useful for taking off-system backups, because a snapshot does not change while the backup is running.
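
The append-only behaviour described above can be illustrated with a toy model. This is only a sketch in Python, not how NILFS2 is implemented: the ToyLog class, its methods and the use of the log length as a checkpoint number are all assumptions made up for illustration.

```python
class ToyLog:
    """Append-only store: every change appends a record, nothing is overwritten."""

    def __init__(self):
        # Each record is (path, content); content None marks a deletion.
        self.records = []

    def write(self, path, content):
        self.records.append((path, content))
        return len(self.records)  # hypothetical checkpoint number

    def delete(self, path):
        self.records.append((path, None))
        return len(self.records)

    def view(self, checkpoint=None):
        """Replay the log up to `checkpoint` to rebuild the file tree at that point."""
        upto = len(self.records) if checkpoint is None else checkpoint
        files = {}
        for path, content in self.records[:upto]:
            if content is None:
                files.pop(path, None)
            else:
                files[path] = content
        return files


log = ToyLog()
cp1 = log.write("notes.txt", "first draft")
cp2 = log.write("notes.txt", "second draft")
cp3 = log.delete("notes.txt")

print(log.view())     # current state: the file is gone
print(log.view(cp2))  # state at the second checkpoint
print(log.view(cp1))  # state at the first checkpoint
```

In this model the deleted file can still be read back through an earlier checkpoint, and the log has grown by one record per change, including the deletion.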
Free space
One confusing and inconvenient aspect of NILFS is that it is difficult to know how much free space is potentially available. Traditional file systems accurately report the amount of free space.
NILFS reports only the amount of immediately available free space. For example, if 50% of all files in a NILFS partition are deleted, the free space is not released until the nilfs_cleanerd process has rewritten all the remaining data, which may take hours or days.
This has an interesting consequence: any change made to NILFS, even renaming or deleting a file, reduces the amount of immediately available free space.
Eventually nilfs_cleanerd will reclaim the free space, but that may happen much later, so NILFS is under constant threat of running out of disk space.
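
The gap between immediately available and potentially available free space can be shown with simple accounting. The sketch below is a simplified model, not NILFS code: the Volume class, the 4096-byte metadata cost of a deletion and the instant clean() step are assumptions chosen for illustration.

```python
from dataclasses import dataclass

GIB = 1024 ** 3


@dataclass
class Volume:
    capacity: int      # total bytes in the partition
    log_used: int = 0  # bytes occupied by the log, live or stale
    live: int = 0      # bytes still referenced by the current file tree

    def write(self, nbytes):
        self.log_used += nbytes
        self.live += nbytes

    def delete(self, nbytes, metadata_cost=4096):
        # Deleting only marks data as stale; the metadata update is appended
        # to the log (metadata_cost is an assumed, illustrative figure).
        self.live -= nbytes
        self.log_used += metadata_cost

    def clean(self):
        # End result of a full cleaner pass in this model: only live data remains.
        self.log_used = self.live

    @property
    def free_now(self):
        return self.capacity - self.log_used

    @property
    def free_potential(self):
        return self.capacity - self.live


vol = Volume(capacity=100 * GIB)
vol.write(80 * GIB)
before = vol.free_now

vol.delete(40 * GIB)
print(vol.free_now < before)            # deleting made immediate free space shrink
print(vol.free_potential // GIB)        # space that a full cleaner pass could release

vol.clean()
print(vol.free_now // GIB)              # immediate free space after cleaning
```

In this model deleting 40 GiB leaves slightly less immediately available space than before, and the 60 GiB of potential free space only becomes usable after the cleaning step.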
Obviously NILFS is not suitable for databases or any other I/O-intensive workload, because even a constantly working nilfs_cleanerd may not release free space fast enough. A constantly working nilfs_cleanerd should also be expected to hurt performance. That is why nearly all NILFS benchmarks run without nilfs_cleanerd in the background are worthless: they demonstrate "ideal" performance that is rarely seen in practice.
One may ask how such a file system can be useful. We believe there is a strong use case for it. Imagine volatile, frequently changing but important data, such as a user's documents. Such data may require frequent backups. However, backups have many challenges and limitations.
rsnapshot is a very effective tool for backing up data. It backs up only changed data and creates a mirror hierarchy in the file system for easy access to historical data. However, this functionality comes at the cost of many files and hard links. Having over 10 million files and hard links in an rsnapshot tree is not unusual. Each rsnapshot invocation takes a lot of time and resources when the amount of data to back up is significant. Performance degradation during an rsnapshot run (or any other backup) is quite noticeable.
Because it is difficult to predict how much space an rsnapshot tree will occupy, it is hard to determine how many backups fit on a partition and how often backups can be taken. There is always a risk of losing hours of work if data is lost or accidentally deleted after the last backup.
None of these challenges exist on NILFS: all free space above the configured threshold is effectively used by automatic checkpoints. Every change to the file system creates another checkpoint. Checkpoints are cheap, and creating them adds practically no overhead compared to an rsnapshot run.
Automatic checkpoints make NILFS a very forgiving file system, a valuable feature when the human factor is involved. Considering the above, NILFS may be a perfect file system for "home" partitions.
Automatic maintenance
Apart from nilfs_cleanerd, which runs in the background whenever immediately available free space drops below the configured threshold, NILFS requires no additional maintenance. All traditional file systems, including the newer Btrfs, suffer from fragmentation and benefit from periodic defragmentation, which is left to the administrator to perform.
Problems
nilfs_cleanerd never stops on a nearly full file system when there is no way to reclaim free space up to the configured threshold.
nilfs_cleanerd cannot resume after being interrupted. It starts all over again if the computer is rebooted or the file system is re-mounted. In practice, if it takes 3 days to rewrite the data in a NILFS partition to reclaim free space, a restart adds three more days before the free space becomes available.
NILFS is best suited to continuous (uninterrupted) operation on stable hardware.
It is hard to tell how much free space is potentially available. For example, when many files have just been deleted from a NILFS partition, the amount of potential free space (max_free_space - size_of_all_files) is not reported, so it is hard to determine whether enough data was deleted to make room for something to be copied to the partition.
It may be difficult to estimate a safe amount of free space for a NILFS partition. Different workflows consume free space at different rates, but to avoid running out there must be enough free space to buy the time nilfs_cleanerd needs to release more. Depending on how much data is in the NILFS partition, in the worst case nilfs_cleanerd may need to rewrite all of it. For example, if that takes 3 days, then the space freed by deleting some files may only become available 3 days later.
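
The last point can be turned into a rough back-of-the-envelope estimate. The sketch below assumes steady average rates, which real workloads and the real cleaner will not have. The function names and all figures are hypothetical and are chosen only to land near the "3 days" used as an example above.

```python
SECONDS_PER_DAY = 86_400


def cleaner_duration(data_bytes, rewrite_rate):
    """Worst case: seconds needed to rewrite all data at a steady rate (bytes/s)."""
    return data_bytes / rewrite_rate


def headroom_needed(write_rate, cleaner_seconds):
    """Bytes the workload appends to the log while the cleaner is still working."""
    return write_rate * cleaner_seconds


# Hypothetical figures, not measurements.
data_bytes = 2_000_000_000_000  # 2 TB of data in the partition
rewrite_rate = 8_000_000        # cleaner rewrites 8 MB/s on average
write_rate = 100_000            # workload appends 100 KB/s on average

seconds = cleaner_duration(data_bytes, rewrite_rate)
headroom = headroom_needed(write_rate, seconds)

print(f"cleaner pass: {seconds / SECONDS_PER_DAY:.1f} days")  # 2.9 days
print(f"headroom:     {headroom / 1e9:.0f} GB")               # 25 GB
```

Under these assumed rates about 25 GB of immediately available space would be consumed before a full cleaner pass finishes, so a smaller reserve would risk filling the partition first.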
Reliability
Incidentally, the lab computer we used to test NILFS2 had a hardware problem that manifested itself as random hang-ups. Initially the cause of the problem was not clear, and during the investigation the system hung more than 60 times without leaving any clues in the logs. NILFS was in active use the whole time, and most of the time nilfs_cleanerd was running. After each reboot we verified the integrity of all the data. Remarkably, not a single case of corruption or data loss was found.
There is no fsck utility available for NILFS. It has been argued that such a utility could be useful because the file system still keeps some metadata in fixed areas. However, the log-structured nature of NILFS makes recovery after a failure remarkably effective. Mounting NILFS after a crash takes longer, but the file system is always consistent and discards only the last unfinished operation(s).
Generally NILFS is good enough on Linux kernels from version 3.0 onwards (we recommend 3.2 or newer).
Testing was conducted on Debian GNU/Linux (x86_64) with Linux kernel 3.0.