NILFS is an unusual file system that stands apart from the traditional mainstream file systems most people are familiar with.

The technical aspects of the NILFS file system are beautifully explained by Neil Brown in his article "A NILFS2 score card".

NILFS2 has a much smaller code base than Btrfs. Unlike Btrfs, NILFS2 is not overburdened with to-be-completed features.

The NILFS2 design is elegant: as a log-structured file system it is mostly a "log" or a "journal". NILFS2 uses copy-on-write by design, so instead of overwriting data it appends changes to the end of the log.
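The append-only idea can be illustrated with a toy in-memory model. This is only a sketch of the general technique, not of the NILFS2 on-disk format: the class AppendOnlyLog and its methods are hypothetical names invented for this example.

```python
class AppendOnlyLog:
    """Toy in-memory model of a log-structured, copy-on-write store."""

    def __init__(self):
        self.log = []          # every record ever written, oldest first
        self.checkpoints = []  # log length recorded after each change

    def write(self, path, data):
        # Never overwrite: append a new record and remember this point.
        self.log.append((path, data))
        self.checkpoints.append(len(self.log))
        return len(self.checkpoints) - 1  # number of the new checkpoint

    def delete(self, path):
        # A deletion is just another record, so it also grows the log.
        return self.write(path, None)

    def read(self, path, checkpoint=None):
        # Replay the log up to the chosen checkpoint (default: latest).
        end = len(self.log) if checkpoint is None else self.checkpoints[checkpoint]
        value = None
        for record_path, data in self.log[:end]:
            if record_path == path:
                value = data
        return value


fs = AppendOnlyLog()
first = fs.write("notes.txt", "draft")
fs.write("notes.txt", "final")
fs.delete("notes.txt")

print(fs.read("notes.txt"))         # current state: the file is gone
print(fs.read("notes.txt", first))  # state as of the first change
print(len(fs.log))                  # number of records kept in the log
```

Because nothing is overwritten, older states stay readable until something removes the old records, and every change, including a deletion, makes the log longer.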

A distinctive feature of NILFS2 is automatic snapshotting. Every change to the file system is appended to the log and creates another snapshot (NILFS calls these automatically created points checkpoints). Snapshots give access to historical data, so changes can be inspected and reverted if necessary.

Snapshots are useful for taking off-system backups because a snapshot does not change during the backup.

Free space

One of the confusing and inconvenient aspects of the NILFS file system is how difficult it is to tell how much free space is potentially available. Traditional file systems accurately report the amount of free space.

NILFS reports only the amount of immediately available free space. For example, if 50% of all files in a NILFS partition are deleted, the free space won't be released until the nilfs_cleanerd process rewrites all remaining data, which may take hours or days.

This is interesting: any change made to NILFS, even renaming or deleting a file, reduces the amount of immediately available free space.

Eventually nilfs_cleanerd will reclaim the free space, but that may happen much later, so NILFS is under constant threat of running out of disk space.
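Since only the immediately available figure is reported, it may be worth watching that figure directly. The sketch below uses Python's standard shutil.disk_usage, which returns whatever the mounted file system itself reports. The helper names, the 10% threshold and the path "." are assumptions made for illustration; on a real system the path would be the NILFS mount point.

```python
import shutil


def free_fraction(path):
    """Fraction of the file system at `path` that is reported as free."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0
    return usage.free / usage.total


def is_low_on_space(path, min_free_fraction=0.10):
    """True when the reported free space is below the chosen threshold."""
    return free_fraction(path) < min_free_fraction


# "." is only a placeholder; substitute the mount point to be watched.
print(f"free: {free_fraction('.'):.1%}, low: {is_low_on_space('.')}")
```

A check like this says nothing about space that nilfs_cleanerd could still reclaim; it only shows how close the partition is to running out right now.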

Obviously NILFS is not suitable for databases or any other I/O-demanding workload, because even a constantly working nilfs_cleanerd may not release free space fast enough. A constantly working nilfs_cleanerd should also be expected to hurt performance. That is why nearly all NILFS benchmarks run without nilfs_cleanerd in the background are worthless: they demonstrate "ideal" performance rarely seen in practice.

One may ask how such a file system can be useful. We believe there is a strong use case for it. Imagine volatile, frequently changing but important data, like a user's documents. Such data may require frequent backups. However, backups have many challenges and limitations.

rsnapshot is a very effective tool for backing up data. It backs up only changed data and creates a mirror hierarchy in the file system for easy access to historical data.

However, rsnapshot's functionality comes at the cost of many files and hard links. Having over 10 million files and hard links in an rsnapshot tree is not unusual.

Each rsnapshot invocation takes a lot of time and resources when the amount of data to back up is significant. Performance degradation during an rsnapshot run (or any backup) is quite noticeable.

Because it is difficult to predict how much space an rsnapshot tree will occupy, it is hard to determine how many backups fit on a partition and how often backups can be taken. There is always a risk of losing hours of work if data is lost or accidentally deleted after the last backup.

None of these challenges exist on NILFS: all free space above the configured threshold is effectively used by automatic checkpoints. Every change to the file system creates another checkpoint. Checkpoints are cheap, and creating them adds no overhead compared to a run of rsnapshot.

Automatic checkpoints make NILFS a very forgiving file system, a valuable feature when the human factor is involved. Considering the above, NILFS may be a perfect file system for "home" partitions.

Automatic maintenance

Apart from nilfs_cleanerd, which runs in the background whenever immediately available free space drops below the configured threshold, NILFS requires no additional maintenance. All traditional file systems, including the newer Btrfs, suffer from fragmentation and benefit from periodic defragmentation, which is left to the administrator.

Problems

nilfs_cleanerd will never stop on a nearly full file system when there is no way to reclaim free space up to the configured threshold.

nilfs_cleanerd cannot resume if interrupted. It starts all over again if the computer is rebooted or the file system is re-mounted. In practice, if it takes 3 days to rewrite the data in a NILFS partition to reclaim free space, a restart adds three more days before the free space becomes available. NILFS is best for continuous (uninterrupted) operation on stable hardware.

It is hard to understand how much free space is potentially available. For example, when many files have just been deleted from a NILFS partition, the amount of potential free space (max_free_space - size_of_all_files) is not advertised, so it is hard to determine whether enough data was deleted to make room for something to be copied to the partition.
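The potential free space formula above can be approximated by hand. The following is a rough sketch under several assumptions: the function names are hypothetical, the capacity has to be supplied by the caller, and the result ignores file system metadata and any space still held by retained snapshots. It also does not stop at mount boundaries, so the directory walked should contain no other mounted file systems.

```python
import os
import tempfile


def live_data_bytes(root):
    """Sum the sizes of all files under `root`, counting hard links once."""
    total = 0
    seen = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                st = os.lstat(os.path.join(dirpath, name))
            except OSError:
                continue  # file vanished or is unreadable; skip it
            key = (st.st_dev, st.st_ino)
            if key in seen:
                continue
            seen.add(key)
            total += st.st_size
    return total


def potential_free_bytes(capacity_bytes, live_bytes):
    """max_free_space - size_of_all_files, never below zero."""
    return max(capacity_bytes - live_bytes, 0)


# Small self-contained demonstration on a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "a.bin"), "wb") as f:
        f.write(b"x" * 1000)
    with open(os.path.join(tmp, "b.bin"), "wb") as f:
        f.write(b"y" * 500)
    demo_live = live_data_bytes(tmp)

print(demo_live, potential_free_bytes(10_000, demo_live))
```

Comparing this estimate with the free space the file system reports gives a rough idea of how much nilfs_cleanerd may still be able to reclaim.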

It may be difficult to estimate the safe amount of free space in a NILFS partition. Different workflows consume free space at different rates, but to avoid running out there should be enough free space to buy the time nilfs_cleanerd needs to release more. Depending on how much data is in the NILFS partition, in the worst case nilfs_cleanerd may need to rewrite all of it. For example, if that takes 3 days, the free space released by deleting some files may appear only 3 days later.
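The reasoning above reduces to simple arithmetic: the free space must cover everything written to the log while the cleaner is still working. Here is a minimal sketch with hypothetical function names and made-up numbers: the 2 GiB per hour write rate is an assumption, and the 72 hours correspond to the 3-day example in the text.

```python
GIB = 1024 ** 3


def required_headroom_bytes(write_rate_bytes_per_hour, cleaner_hours):
    """Free space needed to survive until the cleaner releases more."""
    return write_rate_bytes_per_hour * cleaner_hours


def hours_until_full(free_bytes, write_rate_bytes_per_hour):
    """How long the current free space lasts at the given write rate."""
    if write_rate_bytes_per_hour <= 0:
        return float("inf")
    return free_bytes / write_rate_bytes_per_hour


# Assumed workload: 2 GiB of log growth per hour, cleaner needs 72 hours.
headroom = required_headroom_bytes(2 * GIB, 72)
print(headroom // GIB, "GiB of headroom needed")
print(hours_until_full(100 * GIB, 2 * GIB), "hours until 100 GiB is used up")
```

Note that the write rate here means growth of the log, which includes renames and deletions as well as new data.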

Reliability

Incidentally, the lab computer that we used to test NILFS2 had a hardware problem that manifested itself as random hang-ups. Initially the cause of the problem was not clear, and during the investigation the system hung more than 60 times without leaving any clues in the logs. NILFS was in active use the whole time, and nilfs_cleanerd was running most of the time. After each reboot we verified the integrity of all the data. Remarkably, not a single case of corruption or data loss was found.

There is no fsck utility available for NILFS. It has been argued that such a utility could be useful because there are still some fixed areas used by file system metadata. However, the log-structured nature of NILFS is remarkably effective at recovery after failure. It takes longer to mount NILFS after a crash, but the file system is always consistent and discards only the last unfinished operation(s).

Generally, NILFS is good enough on Linux kernels starting from version 3.0 (we recommend 3.2 and newer).


Testing was conducted on Debian x86_64 GNU/Linux_3.0

See also