Advantages and benefits of ZFS and Btrfs over ext4

labbe5
Posts: 2159
Joined: Wed 13 Nov 2013, 14:26
Location: Canada

#1 Post by labbe5 »

http://distrowatch.com/weekly.php?issue=20130415#qa

Jesse Smith of DistroWatch explains the benefits of using the ZFS and/or Btrfs file systems.

Why-go-advanced asks: How about a simple, layman explanation for what use a normal user would have for advanced file systems. Is there a huge benefit of either of these (Btrfs and ZFS) over ext4 for a regular user?

DistroWatch answers: Yes, there are certainly many advantages for regular users who are interested in switching from a traditional file system such as ext4 and moving to either Btrfs or ZFS. I'm going to focus mostly on ZFS here as it is the technology I've used the most and therefore I'm more familiar with it, but much of what I say about ZFS will be applicable to Btrfs as well.

The first and perhaps most obvious advantage is the ease of setting up these advanced file systems. When you install Linux on a traditional file system or when you add a new partition to an existing installation, what are the steps? We have to create a partition, we need to format that partition and then we need to assign the partition a mount point, probably by adding an entry to our system's fstab file. If we are lucky our distribution's installer will take care of a lot of this for us, but we still need to divide up the hard disk, select the size of the new partition, format it and select its mount point. ZFS makes this wonderfully easy. Adding ZFS storage space (called a pool) to our operating system is often a one-step event. We tell ZFS to take over a hard disk (or an existing partition) and it takes care of the formatting and mounting. We don't have to format anything and in many cases we don't need to partition anything; ZFS just takes care of it for us. Recently I added a ZFS storage pool to one of my systems and the command was simply this:
zpool create Data /dev/sdb
Given this command ZFS created a new storage pool using the second hard disk on my machine, handled any formatting it might need, created a new directory called /Data and mounted the new storage space under my new Data directory. When I rebooted the machine the new storage space was automatically mounted and available for me. It's very convenient this way.

In addition to being easy to set up, advanced file systems are quite flexible. Once we have created an ext4 partition we are pretty much stuck with it at the size we chose. But with ZFS I can easily add additional disks to a given storage pool, which dynamically grows the available storage space. Let's say I have several users on my machine who are all using my storage space mounted under /Data. I want to grow the space without taking the system offline for a long period of time and I don't really want to have to copy all of the data from an existing disk or partition to a new disk. I can do this easily with ZFS by plugging in a new disk and running:
zpool add Data /dev/sdc
The storage pool has now expanded to use all of the new drive and all of its space is available under the existing mount point, /Data.

Another big advantage to using Btrfs and ZFS is the ability to make snapshots. At any given moment we can create a copy of the existing file system and set it aside. These snapshots occur instantly and do not use up additional disk space until the contents of the snapshot differ from the current contents of the file system. This does two things for us. First, it makes it very cheap and easy to maintain multiple versions of data, configuration files and applications. Prior to any application upgrade I can make a snapshot of the operating system. Once a day I can snapshot all of the documents my users have in their home folders. Later I can restore the file system back to a known good state. Alternatively I can browse through existing snapshots of the file system and restore a single file or directory. This is very handy if we have accidentally erased a file or a file has become corrupted. This reduces our reliance on external backups. Keeping backups is still important as it guards against hardware failure, but when we run advanced file systems accidentally deleting a file is easy to reverse and doesn't send us digging through archives.
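To make the "instant and free until something changes" point concrete, here is a minimal Python sketch of the copy-on-write idea behind snapshots. It is a toy model, not how ZFS or Btrfs are actually implemented: the class name CowFileSystem and all of its methods are made up for illustration, and unreferenced blocks are never freed.

```python
class CowFileSystem:
    """Toy copy-on-write store (hypothetical, for illustration only)."""

    def __init__(self):
        self.blocks = {}      # block id -> bytes; the only place data lives
        self.live = {}        # path -> block id for the current file system
        self.snapshots = {}   # snapshot name -> frozen {path: block id} map
        self._next_id = 0

    def write(self, path, data):
        # Never overwrite in place: new data always lands in a new block,
        # so blocks referenced by older snapshots stay untouched.
        block_id = self._next_id
        self._next_id += 1
        self.blocks[block_id] = data
        self.live[path] = block_id

    def delete(self, path):
        # Only the live reference goes away; snapshots keep theirs.
        del self.live[path]

    def snapshot(self, name):
        # Copies the small path -> block map, never the data itself.
        self.snapshots[name] = dict(self.live)

    def restore_file(self, name, path):
        # Bring back a single file from a snapshot.
        self.live[path] = self.snapshots[name][path]

    def rollback(self, name):
        # Return the whole file system to the snapshot's state.
        self.live = dict(self.snapshots[name])

    def read(self, path):
        return self.blocks[self.live[path]]


fs = CowFileSystem()
fs.write("/home/alice/report.txt", b"draft 1")
fs.snapshot("monday")                        # no new data block is created
fs.write("/home/alice/report.txt", b"draft 2")
fs.delete("/home/alice/report.txt")          # the accidental deletion
fs.restore_file("monday", "/home/alice/report.txt")
print(fs.read("/home/alice/report.txt"))
```

Taking the snapshot adds nothing to the block table; extra space is only used once a file is rewritten after the snapshot, which matches the behaviour described above.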

Btrfs and ZFS are both designed with extremely large amounts of data in mind. This means we can grow storage pools to virtually any size and store massively large files in these file systems. In addition, both storage technologies attempt to use space efficiently. Both file systems support compression of data to squeeze as much information as possible onto our disks. Further, ZFS has (and Btrfs is developing) a concept called deduplication. This basically means that multiple files which contain the same data only need to be stored in one place. Let's imagine we somehow ended up with three copies of a 1GB file on our hard drive. Usually this would mean all three copies take up a total of 3GB of space. With deduplication all three copies can be treated as one file which is simply visible in three different places. Therefore the three identical 1GB files require just 1GB of storage space.
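The three-copies example can be sketched in a few lines of Python. This is a simplified model under stated assumptions: it deduplicates whole files keyed by a SHA-256 hash, whereas real implementations typically work on smaller blocks, and the DedupStore class and its method names are hypothetical. The files are 1 MiB rather than 1GB to keep the demo light.

```python
import hashlib


class DedupStore:
    """Toy content-addressed store (hypothetical, for illustration only)."""

    def __init__(self):
        self.blocks = {}   # SHA-256 hex digest -> bytes, stored once
        self.files = {}    # file name -> digest of its contents

    def put(self, name, data):
        digest = hashlib.sha256(data).hexdigest()
        # Identical content hashes to the same key, so it is stored once.
        self.blocks.setdefault(digest, data)
        self.files[name] = digest

    def get(self, name):
        return self.blocks[self.files[name]]

    def logical_size(self):
        # What the user sees: the sum of all file sizes.
        return sum(len(self.blocks[d]) for d in self.files.values())

    def physical_size(self):
        # What is actually stored: each unique content counted once.
        return sum(len(b) for b in self.blocks.values())


store = DedupStore()
payload = b"x" * (1024 * 1024)               # 1 MiB stand-in for the 1GB file
for name in ("copy1.iso", "copy2.iso", "copy3.iso"):
    store.put(name, payload)
print(store.logical_size(), store.physical_size())
```

In this sketch, rewriting one of the three names with different content stores a new entry for that name only; the other two names still point at the original data.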

Though probably only of interest to administrators there are some more nice features. ZFS in particular makes it very easy to create mirrored disk configurations. This basically means that any data placed on one disk is also placed on a second disk. Should one disk fail, our information is safe on the second disk. Systems which make use of mirroring or RAID configurations can get an added bonus from ZFS, namely data integrity. It is possible for files to become corrupted over time and ZFS tries to guard against this by maintaining checksums (a digital fingerprint) of our data. When a file's data no longer matches its fingerprint, ZFS will automatically try to find a second copy of our file on a mirrored disk and use that second copy. The corrupted copy of our file is then overwritten by the good copy, preserving our data against corruption.
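The checksum-and-repair behaviour described above can be modelled roughly as follows. This is only a sketch: two Python dicts stand in for the two mirrored disks, SHA-256 is my choice of fingerprint for the example, and the Mirror class and its methods are hypothetical names, not part of any ZFS tooling.

```python
import hashlib


def checksum(data):
    """Digital fingerprint of a piece of data (SHA-256 chosen for the sketch)."""
    return hashlib.sha256(data).hexdigest()


class Mirror:
    """Toy two-disk mirror with self-healing reads (illustration only)."""

    def __init__(self):
        self.disks = [{}, {}]   # each dict plays the role of one disk
        self.checksums = {}     # file name -> fingerprint recorded at write time

    def write(self, name, data):
        # Every write goes to both disks, plus one recorded fingerprint.
        for disk in self.disks:
            disk[name] = data
        self.checksums[name] = checksum(data)

    def read(self, name):
        expected = self.checksums[name]
        # Look for any copy that still matches its fingerprint.
        good = None
        for disk in self.disks:
            if checksum(disk[name]) == expected:
                good = disk[name]
                break
        if good is None:
            raise OSError("all copies of %r are corrupted" % name)
        # Overwrite any corrupted copy with the good one.
        for disk in self.disks:
            if checksum(disk[name]) != expected:
                disk[name] = good
        return good


pool = Mirror()
pool.write("photo.jpg", b"holiday picture")
pool.disks[0]["photo.jpg"] = b"holiday pixture"   # simulate silent corruption
print(pool.read("photo.jpg"))                     # served from the second disk
print(pool.disks[0]["photo.jpg"])                 # first disk has been repaired
```

If both copies fail the check, the sketch raises an error instead of returning bad data, which is why backups still matter even with a mirror.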

Getting back to the original question, is there a "huge benefit" to using Btrfs or ZFS over a file system such as ext4? Perhaps not one single big reason that will drive people to migrate, but there are several small benefits to using ZFS or Btrfs. Many of these benefits will appeal to system administrators and people who have massive amounts of data, but snapshots and the ease of adding additional storage space do make these advanced file systems appealing to home users too. Perhaps the question could be turned around. Given the many benefits of running Btrfs and ZFS is there any reason for people to still use ext4? The only perk to using ext4 of which I am aware is that ext4, under heavy load, will probably read from and write to a hard disk faster than Btrfs and ZFS can. Still, most of us don't require raw speed as much as we need data integrity and the ability to browse backward in time to earlier snapshots of our data. This is why I believe it makes sense to try (and possibly migrate to) one of the more advanced file systems. I have been using ZFS on Linux during the past year at home and have found it to be a welcome and reliable tool.

LazY Puppy
Posts: 1934
Joined: Fri 21 Nov 2014, 18:14
Location: Germany

#2 Post by LazY Puppy »

"Therefore the three identical 1GB files require just 1GB of storage space."
Doesn't this mean that if some of the data in this 1GB of storage is corrupted, all three files will be corrupted?

How is this different to a symbolic link?

And what about this: if you don't have a physical copy (a real backup), doesn't that mean you don't really have the file/data at all?
RSH

"you only wanted to work your Puppies in German", "you are a separatist in that you want Germany to secede from Europe" (musher0) :lol:

No, but I gave my old drum kit away for free to a music store collecting instruments for refugees! :wink:
