I maintain ZFS support in Gentoo Linux, and Sabayon's current ZFS support is the result of a collaboration between myself and lxnay. micia asked me to comment here regarding ZFS.
First, there is a notion of opportunity cost in economics, which basically states that pursuing one opportunity eliminates the ability to pursue others. It follows that it is logical to pursue the best opportunities, and this applies both to decisions made by OS developers and to those made by end users. People use Sabayon Linux because they trust its developers to make good technical decisions. If they did not, they would use something else. Sabayon had the opportunity to use btrfs instead of ZFS. That would have required far less effort, and much of the time spent on integration could have been spent on other things. Earlier this year, I offered to collaborate with lxnay on ZFS support in Sabayon. He decided to accept my offer despite losing time that he could have spent on other opportunities. So long as you trust lxnay to make good technical decisions, you can only believe that he decided to integrate ZFS on the basis of technical superiority. There would be no reason for him to integrate ZFS otherwise.
Second, ZFS is unique in that it is a production filesystem that was designed with a strong focus on data integrity and has a solid track record for delivering it. No other filesystem has all of these properties. If you speak to btrfs developers/contributors in #btrfs on freenode, they will tell you that btrfs is NOT production ready and that anyone using it should expect to lose data. That is what they told me this past week.
Third, ZFS integrates the entire software stack (i.e. RAID, logical volume management, caching, IO management and the filesystem) to obtain performance and reliability capabilities that other filesystems cannot deliver. In particular, it has 4 key features that provide excellent performance:
- Alignment issues are eliminated by a combination of letting ZFS do the partitioning and a concept called ashift, which dictates an IO size of which all IO operations are multiples. That size is 2 to the power of ashift bytes, so ashift=9 is 512 bytes and ashift=12 is 4096 bytes. ashift is set at pool creation.
- ZFS uses its own IO elevator when it manages the disks, while using Linux's noop elevator underneath so that Linux does not second guess it. The CFQ IO scheduler has been found to have some very bad corner cases where your system will literally hang on IO operations until they finish. In my own experience, it can hang for hours (when deleting millions of small files off ext4). ZFS' ability to avoid this ensures a minimum level of performance. This might not matter so much for Sabayon because it uses BFQ, which lacks that flaw.
- ZFS has a concept known as the ZIL (ZFS Intent Log). Synchronous operations require that data be flushed to storage before they can return. This process requires multiple IOs and it is slow. The ZIL enables ZFS to sequentialize operations, so that changes to a filesystem that would require multiple IOs are converted into a single sequential IO. ZFS can then tell the program that the operation has finished, enabling quick IO completion times, and flush the changes recorded in the ZIL to the disk asynchronously. This is safe because in the event of a crash, the ZIL can be read to determine what needs to be flushed. It also has a significant effect on random write performance, and it is possible to put the ZIL on a separate block device such as a solid state disk, making things even faster.
- ZFS uses a page replacement algorithm called ARC (Adaptive Replacement Cache). Typically, operating system caches use a page replacement algorithm called LRU (Least Recently Used). It is quite simple in that the most recently accessed blocks on a disk are cached. This works well when you read the same data repeatedly, but if your workload varies (e.g. you are compiling software while using your desktop), it is easy for the kernel to evict things that you want to be cached, which causes random IO lags. The ARC algorithm was invented by researchers at IBM in 2002. It keeps track of pages that have been evicted from cache via a concept known as a ghost list, which stores recently evicted pages. If pages listed in the ghost list are accessed, then ZFS will make a special effort to avoid evicting them again. This means that ARC will recognize your desktop usage patterns and avoid evicting things that you care about because of background IO. This avoids IO lags that are endemic to other filesystems. ZFS also has something called L2ARC (literally, Level 2 ARC). It permits SSDs to be used to augment system memory for an even larger cache and an even better hit rate.
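To make the ashift arithmetic from the first bullet concrete, here is a small sketch in Python. It is not ZFS code; the function names are made up for illustration. It only shows how an ashift value maps to an IO size and what "all IO operations are multiples of it" means.

```python
def io_size(ashift: int) -> int:
    """Return the IO size in bytes implied by an ashift value (2 ** ashift)."""
    return 1 << ashift


def is_aligned(offset: int, length: int, ashift: int) -> bool:
    """Check that an IO starts and ends on a multiple of the ashift IO size."""
    size = io_size(ashift)
    return offset % size == 0 and length % size == 0


def round_up(length: int, ashift: int) -> int:
    """Round a length up to the next multiple of the ashift IO size."""
    size = io_size(ashift)
    return (length + size - 1) // size * size


if __name__ == "__main__":
    for ashift in (9, 12):
        print(f"ashift={ashift} -> {io_size(ashift)} bytes")
```

With ashift=12, a 5000 byte write would be rounded up to 8192 bytes in this sketch, which is why every IO lands on a 4096 byte boundary and never straddles a physical sector.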
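The ghost list idea from the ARC bullet can be sketched as a toy cache in Python. This is a deliberate simplification and not the real ARC algorithm or the ZFS implementation (real ARC balances several lists adaptively). The class name, its two capacity parameters and the "protected" segment are all assumptions made for illustration. The point is only to show how remembering recently evicted keys lets a cache protect data that comes back.

```python
from collections import OrderedDict


class GhostListCache:
    """Toy cache: LRU eviction plus a ghost list of recently evicted keys."""

    def __init__(self, capacity: int, ghost_capacity: int) -> None:
        self.capacity = capacity
        self.ghost_capacity = ghost_capacity
        self.recent = OrderedDict()     # key -> value, ordinary entries
        self.protected = OrderedDict()  # key -> value, re-accessed after eviction
        self.ghost = OrderedDict()      # keys only, remembers recent evictions

    def get(self, key):
        """Return the cached value, or None on a miss."""
        for segment in (self.protected, self.recent):
            if key in segment:
                segment.move_to_end(key)  # mark as most recently used
                return segment[key]
        return None

    def put(self, key, value) -> None:
        """Insert or update a key; ghost hits go to the protected segment."""
        for segment in (self.protected, self.recent):
            if key in segment:
                segment[key] = value
                segment.move_to_end(key)
                return
        if key in self.ghost:
            # Evicted recently and wanted again: protect it this time.
            del self.ghost[key]
            self.protected[key] = value
        else:
            self.recent[key] = value
        self._evict()

    def _evict(self) -> None:
        """Evict ordinary entries first; record each victim in the ghost list."""
        while len(self.recent) + len(self.protected) > self.capacity:
            segment = self.recent if self.recent else self.protected
            victim, _ = segment.popitem(last=False)  # least recently used
            self.ghost[victim] = None
            if len(self.ghost) > self.ghost_capacity:
                self.ghost.popitem(last=False)  # forget the oldest ghost
```

In this sketch, a key that is evicted and then requested again survives a later stream of one-off keys, whereas a plain LRU cache of the same size would evict it again. That is the behavior described above for desktop data competing with background IO.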
Fourth, there is a great deal of FUD about licensing, but it is rather simple. The CDDL and GPL are both restrictive licenses, and the combination of them causes problems for people wanting to use pieces of code exclusively available under one license with pieces of code exclusively available under the other in the same binary. In the case of the kernel, this prevents us from distributing ZFS as part of the kernel binary. However, there is nothing in either license that prevents us from distributing it in the form of a binary module or in the form of source code. No one who has claimed otherwise has so far been able to find the conflicting provisions of the CDDL and GPL that prevent this form of usage.
With that said, the nature of the CDDL-GPL conflict means that many of the rules that apply to binary kernel modules also apply to ZFS. Many Linux devices, including all Android devices, use binary kernel modules, and it is quite commonly accepted that this is legal (although kernel developers hate it) so long as they are not part of the kernel binary. If this were not the case, it is quite probable that we would see lawsuits over this practice, because everyone who has made code contributions to the version of Linux used in the device would be able to sue. A similar situation involving the Andrew Filesystem occurred 10 years ago, when Linus Torvalds publicly stated that it was perfectly legal to distribute it as a kernel module:
http://linuxmafia.com/faq/Kernel/propri ... dules.html
I could say more on other topics, such as ZFS' data integrity guarantees, its ease of administration, its integrated RAID or its various features that few filesystems provide, but I feel that I have adequately addressed kenoby's concerns.