Partition questions, basically the best performance

micia
Sagely Hen
Posts: 2718
Joined: Wed Nov 26, 2008 16:41

Post by micia » Fri Oct 19, 2012 18:00

Well, ext4 and ZFS can also handle SSDs very well, especially ZFS, since it can transparently cache data in RAM. This should improve your disk's lifetime, since SSDs are sensitive to frequent writes.
Quoting Wikipedia:
ZFS uses different layers of disk cache to speed up read and write operations. Ideally, all data should be stored in RAM but that is too expensive. Therefore, the data is automatically cached in a hierarchy to optimize performance vs cost. Frequently accessed data is stored in RAM, and less frequently accessed data can be stored on slower media, such as SSD disks. Data that is not often accessed, is not cached and left on the slow hard drives. If old data is suddenly read a lot, ZFS will automatically move it to SSD disks or to RAM.

kenoby
Simple Hen
Posts: 52
Joined: Thu Jul 02, 2009 6:20

Post by kenoby » Fri Oct 19, 2012 18:16

micia wrote:Well, ext4 and ZFS can also handle SSDs very well, especially ZFS, since it can transparently cache data in RAM. This should improve your disk's lifetime, since SSDs are sensitive to frequent writes.


Point taken. It seems ZFS is deeply entrenched in Sabayon as is, so it should be a non-issue to try it out.
I'm considering tarring / and untarring it onto the reformatted partition; would that be OK? Once I find enough free time, that is.

I'm not familiar with ZFS licensing, but are there, or could there be, any issues now that it's under Oracle?

ryao
Baby Hen
Posts: 4
Joined: Mon Sep 24, 2012 19:19

Post by ryao » Sat Oct 20, 2012 1:48

I maintain ZFS support in Gentoo Linux and Sabayon's current ZFS support is the result of a collaboration between myself and lxnay. micia asked me to comment here regarding ZFS.

First, there is a notion of opportunity cost in economics that basically states that exploring one opportunity eliminates the ability to explore others. It therefore follows that it is logical to exercise the best opportunities, and this applies both to decisions made by OS developers and to those made by end users. People use Sabayon Linux because they trust its developers to make good technical decisions. If they did not, they would use something else. Sabayon had the opportunity to use btrfs instead of ZFS. That would have required far less effort, and much of the time spent on ZFS integration could have been spent on other things. Earlier this year, I offered to collaborate with lxnay on ZFS support in Sabayon. He decided to accept my offer despite losing time to work on other opportunities. So long as you trust lxnay to make good technical decisions, you can only believe that he decided to integrate ZFS on the basis of technical superiority. There would be no reason for him to integrate ZFS otherwise.

Second, ZFS is unique in that it is a production filesystem that was designed with a strong focus on data integrity and has a solid track record for delivering it. No other filesystem has all of these properties. If you speak to btrfs developers/contributors in #btrfs on freenode, they will tell you that btrfs is NOT production ready and that anyone using it should expect to lose data. That is what they told me this past week.

Third, ZFS integrates the entire software stack (i.e. RAID, logical volume management, caching, IO management, the filesystem) to obtain performance and reliability capabilities that other filesystems cannot deliver. In particular, it has 4 key features that provide excellent performance:

  1. Alignment issues are eliminated by a combination of letting ZFS do partitioning and a concept called ashift, which dictates an IO size of which all IO operations are multiples. ashift=9 is 512 bytes. ashift=12 is 4096 bytes. ashift is set at pool creation.
  2. ZFS uses its own IO elevator when it manages the disks while leveraging the noop elevator, so that Linux does not second-guess it. The CFQ IO scheduler has been found to have some very bad corner cases where your system will literally hang on IO operations until they finish. In my own experience, it can hang for hours (when deleting millions of small files off ext4). ZFS' ability to avoid this ensures a minimum level of performance. This might not matter so much for Sabayon because it uses BFQ, which lacks that flaw.
  3. ZFS has a concept known as the ZIL (ZFS Intent Log). Synchronous operations require that data be flushed to storage before they can return. This process requires multiple IOs and it is slow. What the ZIL does is enable ZFS to sequentialize operations, so that changes to a file system that would require multiple IOs are converted into a single sequential IO. ZFS can then tell the program that it finished, enabling quick IO completion times. It can then flush the changes recorded in the ZIL to the disk asynchronously. This is safe because in the event of a crash, the ZIL can be read to determine what needs to be flushed. It also has a significant effect on random write performance, and it is possible to put the ZIL on a separate block device such as a solid state disk, making things even faster.
  4. ZFS uses a page replacement algorithm called ARC (Adaptive Replacement Cache). Typically, operating system caches use a page replacement algorithm called LRU (Least Recently Used). It is quite simple in that the most recently accessed blocks on a disk are cached and the least recently used ones are evicted first. This works well when you read the same data repeatedly, but if your workload varies (e.g. you are compiling software while using your desktop), it is easy for the kernel to evict things that you want to be cached, which causes random IO lags. The ARC algorithm was invented by researchers at IBM in 2002. It keeps track of pages that have been evicted from cache via a concept known as a ghost list, which stores recently evicted pages. If pages listed in the ghost list are accessed, then ZFS will make a special effort to avoid evicting them again. This means that ARC will recognize your desktop usage patterns and avoid evicting things that you care about because of background IO. This avoids IO lags that are endemic to other filesystems. ZFS also has something called L2ARC (literally, Level 2 ARC). It permits SSDs to be used to augment system memory for an even larger cache and an even better hit rate.
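
To make point 1 a little more concrete: the two values given there (ashift=9 for 512 bytes, ashift=12 for 4096 bytes) fit a simple power-of-two relationship, and alignment is just a divisibility check against that size. The Python below is only an illustrative sketch of that arithmetic. The function names are made up for this post and are not part of any ZFS tool.

```python
def io_size(ashift: int) -> int:
    """Return the IO unit in bytes implied by an ashift value (2 ** ashift)."""
    if ashift < 0:
        raise ValueError("ashift must not be negative")
    return 1 << ashift


def ashift_for(sector_bytes: int) -> int:
    """Return the ashift matching a power-of-two sector size in bytes."""
    if sector_bytes <= 0 or sector_bytes & (sector_bytes - 1):
        raise ValueError("sector size must be a positive power of two")
    return sector_bytes.bit_length() - 1


def is_aligned(offset_bytes: int, ashift: int) -> bool:
    """Check whether a byte offset is a multiple of the ashift IO unit."""
    return offset_bytes % io_size(ashift) == 0


if __name__ == "__main__":
    print(io_size(9), io_size(12))
    # A partition starting at sector 63 (512-byte sectors) is the classic
    # misaligned layout for a device with 4096-byte physical sectors.
    print(is_aligned(63 * 512, 12))
    # A 1 MiB starting offset is aligned for both sizes.
    print(is_aligned(1024 * 1024, 12))
```

The point of the check is that every IO ZFS issues is a multiple of the ashift unit, so a pool created with the right ashift on a partition that starts on such a boundary never straddles two physical sectors.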
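
The ghost list idea from point 4 can also be sketched in a few lines of Python. To be clear, this is a toy and not the real ARC: it has no adaptive sizing and none of the ZFS internals, and the class and attribute names are invented for illustration. It only shows how remembering the keys of recently evicted entries lets a cache protect data that comes back after a burst of background IO.

```python
from collections import OrderedDict


class GhostListCache:
    """Toy LRU cache with a ghost list of recently evicted keys (not real ARC)."""

    def __init__(self, capacity: int, ghost_capacity: int) -> None:
        self.capacity = capacity
        self.ghost_capacity = ghost_capacity
        self.recent = OrderedDict()     # ordinary LRU entries
        self.protected = OrderedDict()  # entries that returned after eviction
        self.ghost = OrderedDict()      # keys only, no data, of recent evictions

    def get(self, key, load):
        """Return the cached value, calling load(key) on a miss."""
        for entries in (self.protected, self.recent):
            if key in entries:
                entries.move_to_end(key)  # refresh its LRU position
                return entries[key]
        value = load(key)
        if key in self.ghost:
            # Evicted recently and wanted again: protect it from now on.
            del self.ghost[key]
            self.protected[key] = value
        else:
            self.recent[key] = value
        self._evict()
        return value

    def _evict(self) -> None:
        while len(self.recent) + len(self.protected) > self.capacity:
            # Ordinary entries are evicted before protected ones.
            victims = self.recent if self.recent else self.protected
            key, _ = victims.popitem(last=False)
            self.ghost[key] = None
            while len(self.ghost) > self.ghost_capacity:
                self.ghost.popitem(last=False)


if __name__ == "__main__":
    cache = GhostListCache(capacity=3, ghost_capacity=8)
    read = lambda key: "data for " + key  # stand-in for a disk read
    cache.get("desktop", read)
    for block in ("scan1", "scan2", "scan3"):  # background IO pushes it out
        cache.get(block, read)
    cache.get("desktop", read)                 # ghost hit: now protected
    for block in ("scan4", "scan5", "scan6"):
        cache.get(block, read)
    print("desktop" in cache.protected)
```

With a plain LRU of the same size, the second burst of scan blocks would evict "desktop" again. Here it stays cached, which is the behaviour described above.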

Fourth, there is a great deal of FUD about licensing, but it is rather simple. The CDDL and GPL licenses are both restrictive licenses, and the combination of them causes problems for people wanting to use pieces of code exclusively available under one license with pieces of code exclusively available under the other in the same binary. In the case of the kernel, this prevents us from distributing ZFS as part of the kernel binary. However, there is nothing in either license that prevents us from distributing it in the form of a binary module or in the form of source code. No one who has claimed otherwise has so far been able to find the conflicting provisions of the CDDL and GPL that prevent this form of usage.

With that said, the nature of the CDDL-GPL conflict means that many of the rules that apply to binary kernel modules also apply to ZFS. Many Linux devices, including all Android devices, use binary kernel modules and it is quite commonly accepted that this is legal (although kernel developers hate it) so long as they are not part of the kernel binary. If this were not the case, it is quite probable that we would see lawsuits over this practice because everyone who has made code contributions to the version of Linux used in the device would be able to sue. A similar situation involving the Andrew Filesystem occurred 10 years ago where Linus Torvalds publicly stated that it was perfectly legal to distribute it as a kernel module:

http://linuxmafia.com/faq/Kernel/propri ... dules.html

I could say more on other topics, such as ZFS' data integrity guarantees, its ease of administration, its integrated RAID or its various features that few filesystems provide, but I feel that I have adequately addressed kenoby's concerns.

micia
Sagely Hen
Posts: 2718
Joined: Wed Nov 26, 2008 16:41

Post by micia » Sat Oct 20, 2012 2:37

I'd like to thank ryao for the very informative post; it addressed some of my own concerns regarding ZFS.
It was a very pleasant read.

kenoby
Simple Hen
Posts: 52
Joined: Thu Jul 02, 2009 6:20

Post by kenoby » Sat Oct 20, 2012 17:05

ryao wrote:I maintain ZFS support in Gentoo Linux and Sabayon's current ZFS support is the result of a collaboration between myself and lxnay. micia asked me to comment here regarding ZFS.

\/snip

I could say more on other topics, such as ZFS' data integrity guarantees, its ease of administration, its integrated RAID or its various features that few filesystems provide, but I feel that I have adequately addressed kenoby's concerns.


Well, color me impressed :P Your post was a joy to read and it cleared up more than I asked for. Bookmarked for future reference. I will proceed to test it once I get enough time.

Thank you for taking the time to write it, and thanks to micia for asking the right person.

Ad nauseam, and I hope it's not too much to ask you guys: what is the preferred way of backing up and copying over to a new filesystem, and what would be the best way to partition the drive? As mentioned, my standard is home and root on separate partitions, but I'd like to make sure before setting the SSD up, so that I use the filesystem correctly.

ryao
Baby Hen
Posts: 4
Joined: Mon Sep 24, 2012 19:19

Post by ryao » Sun Oct 21, 2012 13:22

kenoby wrote:
Ad nauseam, and I hope it's not too much to ask you guys: what is the preferred way of backing up and copying over to a new filesystem, and what would be the best way to partition the drive? As mentioned, my standard is home and root on separate partitions, but I'd like to make sure before setting the SSD up, so that I use the filesystem correctly.


That would be called a stage4 tarball in Gentoo. The unofficial wiki has documentation on it:

http://en.gentoo-wiki.com/wiki/Custom_Stage4

Then you can use it to reinstall your Sabayon Linux installation as if you were doing a Gentoo installation on ZFS. I maintain some notes on this procedure at the following web page:

https://github.com/ryao/zfs-overlay/blo ... fs-install

Note that I haven't actually done this with Sabayon, but it should work with some minor changes.

  1. Use equo to install software instead of emerge
  2. Skip lines 50 through 64 (i.e. everything from emerge --sync to emerge sys-fs/zfs)
  3. Skip lines 92 and 93
  4. Skip lines 98 through 102

I imagine that you will want to remove the lvm init scripts from OpenRC's run levels during the installation. I believe that micia is planning to try this, so he should be able to provide more details.
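
For the backup step itself, here is a minimal Python sketch of the stage4 idea using only the standard library's tarfile module. It is not a replacement for the wiki procedure linked above: the excluded directory list is an assumption, the function names are made up for this post, and it has not been tested on a real Sabayon root.

```python
import posixpath
import tarfile

# Assumption: pseudo-filesystems and scratch directories whose contents
# should not end up in the backup. Adjust this for the actual system.
EXCLUDED_DIRS = ("proc", "sys", "dev", "run", "tmp")


def keep_member(member_name: str, excluded=EXCLUDED_DIRS) -> bool:
    """Return False for anything located inside an excluded directory."""
    name = posixpath.normpath(member_name)
    return not any(name.startswith(d + "/") for d in excluded)


def make_stage4(root: str, archive_path: str, excluded=EXCLUDED_DIRS) -> None:
    """Write a gzip-compressed tarball of root, leaving excluded dirs empty."""

    def member_filter(info: tarfile.TarInfo):
        # Returning None tells tarfile to leave this member out.
        return info if keep_member(info.name, excluded) else None

    with tarfile.open(archive_path, "w:gz") as tar:
        # arcname="." stores relative paths, so the archive can be
        # unpacked into any freshly created filesystem.
        tar.add(root, arcname=".", filter=member_filter)
```

Something like `make_stage4("/", "/mnt/backup/stage4.tar.gz")` (a hypothetical destination on another disk, run as root) would produce the archive. The excluded directories themselves are kept as empty entries so that the mount points still exist after unpacking.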

micia
Sagely Hen
Posts: 2718
Joined: Wed Nov 26, 2008 16:41

Post by micia » Sun Oct 21, 2012 13:30

ryao wrote:I believe that micia is planning to try this, so he should be able to provide more details.


Indeed, I am experimenting with ZFS right now on my system. :)
Thank you for the links; I am sure they are going to be very useful.

It could take some time to get it working, but if I am successful, I will share the results here.

kenoby
Simple Hen
Posts: 52
Joined: Thu Jul 02, 2009 6:20

Post by kenoby » Tue Oct 23, 2012 16:36

Many thanks ryao, I will give it a go and report back as soon as I get enough time.
