The idea of using an SSD to accelerate a file system on rotational media has been around for some time.

However, using an SSD for caching may not require complicated solutions such as flashcache and bcache, which we tried earlier.

Among all mainstream file systems, only ext4 on rotational media can be accelerated with an SSD safely and effectively. Its full performance potential can be unleashed by using one particular feature: an external journal. That said, it is worth noting that xfs can also use an external journal, but only in theory, because xfs cannot be mounted if the external journal device is lost, nor can it be reconfigured to move the journal back to the main device or to change its size.
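For readers who have not used the feature, the sketch below shows one way the journal of an existing ext4 file system could be moved to an SSD partition and back again with e2fsprogs. It only builds the command lines and does not run them. The device names `/dev/sdb1` (rotational disk) and `/dev/sdc1` (SSD partition) and the 4096-byte block size are assumptions for illustration, not the setup used in this test.

```python
def external_journal_commands(fs_dev, journal_dev, block_size=4096):
    """Return e2fsprogs command lines (argv lists) that move the journal of an
    existing, cleanly unmounted ext4 file system to journal_dev.
    Nothing is executed here."""
    return [
        # Format the SSD partition as a dedicated journal device; its block
        # size has to match the block size of the file system.
        ["mke2fs", "-O", "journal_dev", "-b", str(block_size), journal_dev],
        # Remove the internal journal from the file system.
        ["tune2fs", "-O", "^has_journal", fs_dev],
        # Attach the external journal.
        ["tune2fs", "-J", f"device={journal_dev}", fs_dev],
    ]


def internal_journal_commands(fs_dev):
    """Return command lines that convert the journal back to an internal one."""
    return [
        ["tune2fs", "-O", "^has_journal", fs_dev],
        ["tune2fs", "-j", fs_dev],
    ]


if __name__ == "__main__":
    # Hypothetical device names; adjust before running anything for real.
    for argv in external_journal_commands("/dev/sdb1", "/dev/sdc1"):
        print(" ".join(argv))
```

Returning the commands instead of executing them keeps the sketch harmless: the file system must be unmounted and the device names double-checked before any of these are run.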

The first step in optimising ext4 performance is to use the journal_async_commit mount option. This automatically enables the journal_checksum feature as well.

journal_async_commit alone gives a nice performance boost, and it is essential to use it with an external journal.
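To confirm that the option is actually in effect, the effective mount options can be read back after mounting. The following is a minimal sketch that parses text in the `/proc/mounts` format; the device name and the mount point `/mnt/test` in the sample are hypothetical.

```python
def effective_mount_options(mounts_text, mount_point):
    """Return the option list for mount_point from /proc/mounts-style text,
    or None if the mount point is not listed."""
    options = None
    for line in mounts_text.splitlines():
        fields = line.split()
        # Fields: device, mount point, fs type, options, dump, pass.
        if len(fields) >= 4 and fields[1] == mount_point:
            options = fields[3].split(",")  # the last matching entry wins
    return options


def uses_async_commit(mounts_text, mount_point):
    """True if both journal_async_commit and journal_checksum are in effect."""
    opts = effective_mount_options(mounts_text, mount_point) or []
    return "journal_async_commit" in opts and "journal_checksum" in opts


if __name__ == "__main__":
    # Hypothetical sample line; on Linux, pass open("/proc/mounts").read().
    sample = ("/dev/sdb1 /mnt/test ext4 "
              "rw,relatime,journal_checksum,journal_async_commit,data=ordered 0 0\n")
    print(uses_async_commit(sample, "/mnt/test"))
```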

Test: iozone (SYNC,O_DIRECT)

Iozone: Performance Test of File I/O
Version $Revision: 3.397 $
Compiled for 64 bit mode.
Build: linux-AMD64

Machine = Linux 3.7-trunk-amd64 #1 SMP Debian 3.7.3-1~experimental.1 x
SYNC Mode.
Include close in write timing
Include fsync in write timing
O_DIRECT feature enabled
File size set to 4194304 KB
Record Size 256 KB
Command line used: iozone -M -o -c -e -t3 -T -I -s 4g -r 256k -i0 -i2 -i6 -i8 -i9 -i11
Output is in Kbytes/sec
Time Resolution = 0.000001 seconds.
Processor cache size set to 1024 Kbytes.
Processor cache line size set to 32 bytes.
File stride size set to 17 * record size.
Throughput test with 3 threads
Each thread writes a 4194304 Kbyte file in 256 Kbyte records

The table below shows the results of tests in which unbuffered writes were performed with direct access (O_DIRECT) and fsync() after every operation.

iozone -M -o -c -e -t3 -T -I -s 4g -r 256k -i0 -i2 -i6 -i8 -i9 -i11

| File system | Effective mount options | Test execution time (less is better) |
|---|---|---|
| ext4 | rw,relatime,journal_checksum,data=ordered | 125 min (sys 37 s) |
| xfs | rw,relatime,attr2,inode64,noquota | 109 min (sys 35 s) |
| btrfs | rw,relatime,space_cache | 96 min (sys 2 min 45 s) |
| btrfs | rw,relatime,compress=lzo,space_cache | 96 min (sys 2 min 46 s) |
| btrfs | rw,relatime,compress=lzo,space_cache,autodefrag | 95 min (sys 2 min 45 s) |
| ext4 | rw,relatime,journal_checksum,journal_async_commit,data=ordered | 95 min (sys 44 s) |
| nilfs2* | rw,relatime | 57 min (sys 2 min 25 s) |
| ext4 (external journal_data) | rw,relatime,nodelalloc,journal_checksum,journal_async_commit | 47 min (sys 2 min 43 s) |
| ext4 (external journal_data_ordered) | rw,relatime,journal_checksum,journal_async_commit | 38 min (sys 34 s) |

* Don't trust the NILFS2 benchmark. This test was run on a freshly created file system. In a real-world scenario the result will probably be at least 2 times worse because of background cleaning.

The first row (the slowest) shows the default ext4 behaviour (journal_checksum has no measurable effect on performance).

xfs and btrfs are given for comparison.

journal_async_commit gives a nice performance boost and puts ext4 ahead of xfs.

The most interesting ext4 results come after the nilfs2 measurement, which was taken without the cleaner running.

The last two measurements were taken with the ext4 external journal placed on a 1906 MiB (3903488 sectors) partition on the SSD.
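As a quick sanity check on the journal partition size, the sector count converts to MiB as follows. This is a small sketch that assumes 512-byte logical sectors, which the text does not state explicitly.

```python
SECTOR_BYTES = 512  # assumption: 512-byte logical sectors


def sectors_to_mib(sectors, sector_bytes=SECTOR_BYTES):
    """Convert a sector count to mebibytes (1 MiB = 2**20 bytes)."""
    return sectors * sector_bytes / 2**20


if __name__ == "__main__":
    print(sectors_to_mib(3903488))  # 1906.0
```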

In journal_data mode all data is first saved to the journal and then flushed to the main rotational device. In effect all data is written twice, first to the journal and then to the file system, which is why system time is about 5 times greater than in the last test, where journal_data_ordered mode was used.
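The system-time comparison can be checked from the two figures in the table, 2:43 (read here as 2 min 43 s) for journal_data and 34 s for journal_data_ordered. The helper name `to_seconds` is introduced only for this sketch.

```python
def to_seconds(text):
    """Convert 'SS' or 'M:SS', with an optional trailing 's', to seconds."""
    seconds = 0
    for part in text.rstrip("s").split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


journal_data_sys = to_seconds("2:43s")  # external journal, journal_data
ordered_sys = to_seconds("34s")         # external journal, journal_data_ordered

if __name__ == "__main__":
    print(round(journal_data_sys / ordered_sys, 1))  # 4.8
```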

The very last result demonstrates a fantastic performance boost: over 3 times faster than the ext4 default (first test).
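The speed-up figures can be reproduced from the wall-clock times in the results table. The sketch below uses the measured minutes as given there; the short labels are shorthand chosen for this example.

```python
# Wall-clock test execution times in minutes, taken from the results table.
RESULTS_MIN = {
    "ext4 default": 125,
    "xfs": 109,
    "btrfs": 96,
    "ext4 journal_async_commit": 95,
    "nilfs2 (no cleaner)": 57,
    "ext4 external journal_data": 47,
    "ext4 external journal_data_ordered": 38,
}


def speedups(results, baseline_key):
    """Return how many times faster each entry is than the baseline entry."""
    baseline = results[baseline_key]
    return {name: baseline / minutes for name, minutes in results.items()}


if __name__ == "__main__":
    for name, factor in speedups(RESULTS_MIN, "ext4 default").items():
        print(f"{name}: {factor:.2f}x")
```

For the last row this gives 125 / 38, roughly 3.29, which matches the "over 3 times faster" claim.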

Conclusion

The full performance potential of ext4 can be unleashed with careful use of an external journal. This is a safe and effective method that requires no additional software and achieves performance comparable to less reliable SSD caching solutions.

See also

Visualisation of ext4 external journal activity during test (journal_data_ordered, blktrace+seekwatcher).
