What is a file system and what is it for?
All operating systems incorporate one or more file systems to control how information is stored on and retrieved from different media, such as hard drives, SSDs and removable storage units such as pen drives or memory cards. Without a file system, the operating system would not know where one piece of recorded data ends and the next begins, which makes it one of the most important aspects to take into account.
The main functions of any file system are to allocate space to the different files, manage free space, and structure the saved information so that it can be accessed quickly and easily. Another very important aspect is the sectors, and more specifically their size, because the sectors are where the information is stored. File systems also provide methods to create, copy, move, rename and delete the files and directories that we have on the media. In addition, they incorporate some very important features such as access control lists (ACLs) to control permissions, mechanisms to avoid or mitigate fragmentation, journaling (which improves the integrity of the file system) and disk quotas, among other functionality.
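As a small illustration of the space accounting a file system does for us, the following sketch asks the operating system how much space is used and free on the file system that holds a given path. The helper names `describe_usage` and `format_gib` are our own, chosen for this example; only `shutil.disk_usage` comes from the Python standard library.

```python
import shutil


def describe_usage(path="."):
    """Return total, used and free bytes for the file system holding `path`."""
    usage = shutil.disk_usage(path)  # named tuple with total, used, free
    return {"total": usage.total, "used": usage.used, "free": usage.free}


def format_gib(n_bytes):
    """Format a byte count in GiB with two decimals."""
    return f"{n_bytes / 2**30:.2f} GiB"


if __name__ == "__main__":
    for name, value in describe_usage().items():
        print(f"{name:>5}: {format_gib(value)}")
```

Note that used plus free does not always add up to the total, because some file systems keep part of the space reserved.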
Currently, three file systems are widely used in NAS servers from different manufacturers and, of course, in Linux- and FreeBSD-based operating systems for data storage on servers: EXT4, Btrfs and ZFS. The three have different characteristics, and each performs better in different scenarios.
EXT4
EXT4 is the default file system of many Linux-based operating systems. It is a journaling file system and incorporates very important improvements over its predecessors, such as support for larger volumes, lower CPU usage and better read and write speeds. One very important characteristic of EXT4 is that it can reserve disk space for a file without having to fill it with zeros, something that was usually done in other file systems. In addition, this reserved space is usually contiguous in order to avoid or mitigate fragmentation. A related feature is “allocate-on-flush”, also known as delayed allocation: the disk blocks are allocated only when the data is about to be written to disk, which improves performance and reduces fragmentation.
EXT4 has techniques to avoid fragmentation like the one we have just explained, but it also has a tool to defragment individual files or the entire volume without unmounting the disk, although, logically, the file system will be slower while defragmentation is in progress.
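The space reservation described above can be tried from user space. This is a minimal sketch, assuming a Unix-like system where Python exposes `os.posix_fallocate`; the function name `preallocate` and the file name `reserved.bin` are hypothetical, and the fallback branch is our own assumption for platforms without that call.

```python
import os
import tempfile


def preallocate(path, size):
    """Reserve `size` bytes for `path` without writing zeros ourselves."""
    fd = os.open(path, os.O_CREAT | os.O_WRONLY, 0o600)
    try:
        try:
            # Ask the file system to reserve the byte range [0, size).
            os.posix_fallocate(fd, 0, size)
        except (AttributeError, OSError):
            # Assumption: where the call is missing or unsupported we only
            # set the file length. That yields a sparse file and does NOT
            # actually reserve blocks.
            os.ftruncate(fd, size)
    finally:
        os.close(fd)
    return os.path.getsize(path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "reserved.bin")
        print(preallocate(target, 16 * 1024 * 1024), "bytes in file")
```

Whether the reserved blocks end up contiguous is decided by the file system, not by this code.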
This file system has the following characteristics and limits:
- Maximum file size: 16 TiB using 4K blocks.
- Maximum number of files: 4 billion
- Maximum file name size: 255 bytes
- Maximum volume size: 1 EiB
- Transparent data encryption: yes
- Copy on write: no
- Transparent compression: no
- Transparent deduplication: no
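The first two limits in the list can be reproduced with simple arithmetic. The sketch below assumes that the limits come from 32-bit logical block numbers per file and 32-bit inode numbers; the constant names are ours.

```python
BLOCK_SIZE = 4 * 1024        # 4 KiB blocks, as in the list above
LOGICAL_BLOCK_BITS = 32      # assumption: 32-bit block numbers within a file
INODE_NUMBER_BITS = 32       # assumption: 32-bit inode numbers

TIB = 2 ** 40

max_file_bytes = (2 ** LOGICAL_BLOCK_BITS) * BLOCK_SIZE
max_files = 2 ** INODE_NUMBER_BITS

print(max_file_bytes // TIB, "TiB")   # 16 TiB
print(f"{max_files:,} files")         # 4,294,967,296 files
```

This also shows why the maximum file size depends on the block size: a smaller block size would give a proportionally smaller limit under the same assumption.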
Now that we know the main features of EXT4, let’s talk about Btrfs, which is known as the natural successor to the EXT4 file system.
Btrfs
The Btrfs file system was born as a natural successor to EXT4. Its objective is to replace it by removing as many of its limitations as possible, especially regarding the maximum size of files. This file system is oriented above all to servers. It allocates inodes dynamically, so it is not necessary to set a maximum number when creating the file system, as is the case with EXT4. It allows volumes to be configured in a very advanced way, with read-only or writable snapshots, and it also allows snapshots of snapshots. Other features are that it allows mirroring and striping at the object level if we have several hard drives, and that it is capable of verifying data and metadata in real time to maximize data integrity.
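The idea behind verifying data in real time can be sketched with a checksum stored next to each block and checked again on every read. This is only an illustration of the principle, not Btrfs code: the helpers `store` and `verify` are hypothetical, and `zlib.crc32` is used here simply because it ships with Python.

```python
import zlib


def store(block: bytes):
    """Return the block together with the checksum computed at write time."""
    return block, zlib.crc32(block)


def verify(block: bytes, checksum: int) -> bool:
    """Recompute the checksum at read time and compare it with the stored one."""
    return zlib.crc32(block) == checksum


if __name__ == "__main__":
    data, checksum = store(b"important NAS data")
    print(verify(data, checksum))                    # True: block is intact
    print(verify(b"important NAS dat4", checksum))   # False: corruption detected
```

When a redundant copy exists (for example in a mirror), a failed check tells the file system which copy to discard.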
This file system uses copy-on-write for all data and metadata, and it also allows inline compression to save disk space. Btrfs can check the file system without unmounting it, and if we do unmount it, the check is really fast. Of course, it also has an optimized mode for SSD drives and can be defragmented without unmounting.
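To get a feel for what inline compression can save, the following sketch compresses the same buffer at several zlib levels and reports the resulting sizes. The function name `compressed_sizes` and the sample data are our own; real savings depend entirely on the data, and already compressed files such as video gain almost nothing.

```python
import zlib


def compressed_sizes(data: bytes, levels=(1, 6, 9)):
    """Return {level: compressed size in bytes} for each zlib level."""
    return {level: len(zlib.compress(data, level)) for level in levels}


if __name__ == "__main__":
    sample = b"NAS storage block " * 4096   # highly repetitive sample data
    print("original:", len(sample), "bytes")
    for level, size in compressed_sizes(sample).items():
        print(f"level {level}: {size} bytes")
```

Lower levels cost less CPU time per block, which is the same speed versus space trade-off a file system makes when it compresses data on the fly.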
This file system has the following characteristics and limits:
- Maximum file size: 16 EiB.
- Maximum number of files: 2^64 (about 18 quintillion).
- Maximum file name size: 255 bytes
- Maximum volume size: 16 EiB.
- Transparent data encryption: no
- Copy on write: yes
- Transparent compression: yes
- Deduplication: yes
Now that you know the main features of Btrfs, let’s talk about ZFS, one of the most advanced file systems available for Linux and Unix operating systems.
ZFS
The ZFS file system is one of the most advanced in existence. It stands out for its great capacity, for its strong data integrity guarantees and for its great read and write performance. ZFS uses “storage pools” built from virtual devices (vdevs), unlike traditional file systems, which sit on top of a single hardware device such as a hard disk and therefore require a separate volume manager. Thanks to these vdevs, we can configure pools of the simple type, mirrors, or the popular RAID-Z to provide both data redundancy and higher performance. In addition, ZFS can be fitted with SSD drives that act as a read cache (L2ARC) or as a dedicated device for the “ZFS Intent Log” (“ZIL”) to further improve performance.
This file system uses a transactional copy-on-write (CoW) model: live data is never overwritten in place. Instead, the modified data is written to a new location, which greatly improves file integrity in the event of a power outage. We must also bear in mind that the ZIL is used when synchronous writes are needed, in order to reduce their overhead. The downside of CoW is high fragmentation, and ZFS currently has no method to defragment the file system, although work is under way to improve this aspect in future versions.
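The copy-on-write model can be illustrated with a toy in-memory store. This is a deliberately simplified sketch and not how ZFS is implemented: the class `CowStore` and its methods are hypothetical. New data is always appended to a new location and only then does the pointer switch, so an old pointer table still sees the old data, which is also why snapshots are cheap.

```python
class CowStore:
    """Toy copy-on-write store: blocks are never overwritten in place."""

    def __init__(self):
        self._blocks = []   # append-only list of block contents
        self._live = {}     # name -> index of the current block

    def write(self, name, data):
        self._blocks.append(data)                  # data goes to a new location
        self._live[name] = len(self._blocks) - 1   # then the pointer is switched

    def read(self, name, view=None):
        table = self._live if view is None else view
        return self._blocks[table[name]]

    def snapshot(self):
        # A snapshot copies only the pointer table, never the data blocks.
        return dict(self._live)


if __name__ == "__main__":
    store = CowStore()
    store.write("report.txt", b"version 1")
    snap = store.snapshot()
    store.write("report.txt", b"version 2")
    print(store.read("report.txt"))         # b'version 2'
    print(store.read("report.txt", snap))   # b'version 1'
```

The sketch also hints at the fragmentation problem: successive versions of the same file end up scattered across the store.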
Another characteristic of ZFS is that the file system is called a dataset, which lives inside a storage pool. A dataset can be of the filesystem type, which behaves like an ordinary volume, or a zvol, which is a block device. Depending on our needs, we will have to create one type of dataset or the other. Another very important feature of datasets is snapshots: ZFS is capable of taking a total of 2^48 (about 281 trillion) snapshots, and they are taken practically instantly because of how this file system is designed internally.
This file system allows us to configure inline compression, with a choice of compression algorithms that favor either read and write speed or a higher compression ratio to save more space. We can also enable deduplication, a feature that can save a lot of disk space. The downside is that it consumes a large amount of RAM, so you may not want to activate it.
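Why deduplication needs so much RAM becomes clearer with a toy example. The sketch below splits data into fixed-size blocks, hashes each one and stores every distinct block only once; the lookup table of hashes is what has to stay in memory. The names `deduplicate` and `rebuild` and the 4096-byte block size are assumptions made for this illustration, not ZFS internals.

```python
import hashlib


def deduplicate(data: bytes, block_size: int = 4096):
    """Store each distinct block once; return (table, layout)."""
    table = {}    # digest -> block; stands in for the in-memory dedup table
    layout = []   # ordered digests needed to rebuild the original data
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        digest = hashlib.sha256(block).digest()
        table.setdefault(digest, block)   # keep only the first copy
        layout.append(digest)
    return table, layout


def rebuild(table, layout):
    """Reassemble the original data from the stored blocks."""
    return b"".join(table[digest] for digest in layout)


if __name__ == "__main__":
    data = b"A" * 4096 * 3 + b"B" * 4096
    table, layout = deduplicate(data)
    print(len(layout), "blocks referenced,", len(table), "blocks stored")
```

The table grows with the number of distinct blocks, so the more unique data the pool holds, the more memory the lookup needs.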
This file system has the following characteristics and limits:
- Maximum file size: 16 EiB.
- Maximum number of files: 2^48 (about 281 trillion).
- Maximum file name size: 255 bytes
- Maximum volume size: 16 EiB.
- Data encryption: yes
- Copy on write: yes
- Transparent compression: yes
- Transparent deduplication: yes
At RedesZone we have explained the ZFS file system in detail, both its characteristics and its configuration in different operating systems oriented to NAS servers. Now that we know the three most used file systems for NAS servers, let’s take a look at their strengths and weaknesses.
Which file system should I choose for my NAS?
Now that we have seen the main characteristics of the different file systems that we can use in a home or professional NAS server, we are going to look at the advantages and disadvantages of each of them.
The EXT4 file system is the oldest of the three and is more than proven, so it is very stable. In fact, it is still the default file system for the vast majority of Linux distributions, such as Debian and Ubuntu, and for the operating systems of QNAP, Synology and Asustor. If you need to store a large amount of data, create a RAID with everything that entails, and obtain the best read/write performance with the lowest possible resource consumption, EXT4 will surely meet all your needs. This file system incorporates journaling, so you should not lose data in the event of a power failure; however, Btrfs and ZFS are clearly better in this regard.
Btrfs improves on many negative aspects of EXT4, such as the file size limitations. It uses copy-on-write and was designed for very large servers that store a lot of information, so it has many advanced features that EXT4 does not incorporate, such as transparent compression and deduplication. It also incorporates integrated snapshots, something that EXT4 does not have, supports RAID and allocates inodes flexibly. However, Btrfs has been observed to consume more system resources than EXT4 and, under the same conditions (same hardware and same type of files to transfer), to deliver lower read and write speeds.
The ZFS file system is one of the most advanced in existence. It is similar to Btrfs but incorporates really interesting features, such as the possibility of adding new devices to the existing storage and making the new space available immediately, with ZFS itself doing the job of the “RAID” that we had with the other file systems. ZFS stands out for its scalability, large data storage capacity, protection against data corruption (integrity), efficient data compression, deduplication and fast snapshots. It can also check integrity on an ongoing basis and repair errors automatically in a completely transparent way. The downside of ZFS is that it consumes a lot of resources, especially RAM; in addition, if you activate deduplication you will have quite significant additional RAM consumption.
If you have a low- to mid-range NAS server, the file system you should use is clearly EXT4. If you have a mid- to high-range or high-end NAS, you can choose Btrfs or ZFS, depending on whether your operating system supports it. If you are going to use ZFS, bear in mind that deduplication consumes a large amount of RAM; it is the price we must pay to save a large amount of storage space.
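The recommendation above can be summarized as a small decision helper. This is only a sketch of the rule of thumb given in this article: the function name `recommend`, the tier labels and the fallback to EXT4 when neither Btrfs nor ZFS is supported are our own assumptions.

```python
def recommend(tier, supported):
    """Return candidate file systems for a NAS.

    tier: "low-mid", "mid-high" or "high-end" (hypothetical labels).
    supported: set of file systems the NAS operating system offers.
    """
    if tier == "low-mid":
        return ["EXT4"]
    advanced = [fs for fs in ("Btrfs", "ZFS") if fs in supported]
    # Assumption: fall back to EXT4 if the OS offers neither Btrfs nor ZFS.
    return advanced or ["EXT4"]


if __name__ == "__main__":
    print(recommend("low-mid", {"EXT4", "Btrfs"}))
    print(recommend("high-end", {"EXT4", "Btrfs", "ZFS"}))
```

For ZFS, the amount of installed RAM should also weigh in the decision, especially if deduplication is going to be enabled.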