WD Red SMR drives with ZFS may lead to data loss
iXsystems, the company developing the FreeNAS project, has warned of serious compatibility problems between ZFS and some new WD Red hard drives from Western Digital that use SMR (Shingled Magnetic Recording) technology. In the worst case, using ZFS on the affected drives may result in data loss.
What are WD SMR disks and how are they used?
The problems affect WD Red drives of 2 to 6 TB released since 2018, which use DM-SMR (Device-Managed Shingled Magnetic Recording) technology and carry the EFAX suffix in the model number (the EFRX suffix is used for CMR drives). Western Digital noted in its blog that WD Red SMR drives are designed for home and small-business NAS systems with no more than 8 drives and a workload of up to 180 TB per year, which is typical for backup and file sharing. The previous generation of WD Red, the 8 TB WD Red drives, and the WD Red Pro, WD Gold and WD Ultrastar series continue to be produced using CMR (Conventional Magnetic Recording) technology, and their use does not cause problems with ZFS.
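The naming rule above can be sketched as a small check. This is only an illustration: the function name `classify_wd_red` is hypothetical, and the assumption that the digits in a model number encode the capacity in tenths of a terabyte (so `WD40EFAX` is 4 TB) is not stated in the text above. Anything the two rules do not cover is reported as unknown rather than guessed.

```python
import re

# Assumed model-number layout: "WD" + capacity in tenths of a TB + suffix,
# e.g. "WD40EFAX" -> 4.0 TB, suffix "EFAX". This layout is an assumption.
_MODEL_RE = re.compile(r"^WD(\d+)(EF[A-Z]X)$")


def classify_wd_red(model: str) -> str:
    """Return "DM-SMR", "CMR" or "unknown" for a WD Red model number."""
    match = _MODEL_RE.match(model.strip().upper())
    if match is None:
        return "unknown"
    capacity_tb = int(match.group(1)) / 10
    suffix = match.group(2)
    if suffix == "EFRX":
        # EFRX is the identifier used for CMR drives.
        return "CMR"
    if suffix == "EFAX" and 2 <= capacity_tb <= 6:
        # EFAX drives of 2 to 6 TB are the DM-SMR models discussed here.
        return "DM-SMR"
    # Everything else is outside the two rules, so no claim is made.
    return "unknown"


print(classify_wd_red("WD40EFAX"))
print(classify_wd_red("WD40EFRX"))
```

A result of "unknown" should be read as "check the manufacturer's data sheet", not as a statement that the drive is safe.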
The essence of SMR technology is the use of a write head that is wider than a track on the platter, so each track is written with a partial overlap of the adjacent track; as a result, rewriting any data requires rewriting the entire group of tracks. To make such drives workable, zoning is used: the storage space is divided into zones, each a group of blocks or sectors in which data can only be appended sequentially and which is otherwise updated as a whole. In general, SMR disks are more energy-efficient, more affordable and show performance gains on sequential writes, but they lag behind on random writes, including operations such as rebuilding storage arrays.
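The cost of random writes under zoning can be illustrated with a deliberately naive model. It is a sketch, not a description of any real drive's firmware: the zone size `ZONE_BLOCKS` is a hypothetical value, and the model assumes that a write at the zone's append position costs one block while any other write forces the whole zone to be rewritten.

```python
ZONE_BLOCKS = 64  # hypothetical zone size, in blocks


def physical_block_writes(lbas, zone_blocks=ZONE_BLOCKS):
    """Count blocks physically written for a sequence of logical block writes.

    Naive model: each zone has a write pointer. Appending at the pointer
    costs one block; writing anywhere else rewrites the whole zone.
    """
    write_pointer = {}  # zone index -> next block that can be appended
    total = 0
    for lba in lbas:
        zone, offset = divmod(lba, zone_blocks)
        pointer = write_pointer.get(zone, 0)
        if offset == pointer:
            total += 1
            write_pointer[zone] = pointer + 1
        else:
            total += zone_blocks
            write_pointer[zone] = max(pointer, offset + 1)
    return total


sequential = list(range(4 * ZONE_BLOCKS))    # fill four zones in order
with_overwrites = sequential + [5, 70, 130]  # then overwrite three blocks

print(physical_block_writes(sequential))       # 256
print(physical_block_writes(with_overwrites))  # 448
```

In this model three small overwrites add 192 block writes to a 256-block workload, which is the kind of amplification that makes random-write-heavy operations slow on SMR media.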
DM-SMR means that zoning and data placement are handled by the disk controller, so to the system such a disk looks like a conventional hard drive that needs no special handling. DM-SMR uses indirect logical block addressing (LBA), which resembles the logical addressing used in SSDs. After each random write, a background garbage collection pass is required, which causes unpredictable fluctuations in performance. The system may try to optimize its I/O for such a disk without taking into account that the layout reported by the controller is purely logical: when placing new data, the controller applies its own algorithms, which depend on the data already placed. Therefore, before using DM-SMR disks in a ZFS pool, it is recommended to reset the disks to their initial state.
What are Western Digital enterprise drives?
Western Digital is helping to analyze the conditions under which the problems occur and, together with iXsystems, is trying to find a solution and prepare firmware updates. Before any conclusions about a fix are published, drives with the new firmware are to be tested on heavily loaded storage systems running FreeNAS 11.3 and TrueNAS CORE 12.0. It is claimed that, because manufacturers implement SMR differently, some types of SMR disks have no problems with ZFS; however, the testing undertaken by iXsystems covers only WD Red disks based on DM-SMR technology, and SMR disks from other manufacturers require additional research.
So far, the problems with ZFS have been confirmed and reproduced in tests at least for WD Red 4 TB WD40EFAX drives with firmware 82.00A82. They show up as a transition to a failed state under a heavy write load, for example while an array is being rebuilt (resilvering). The problem is thought to affect other WD Red models with the same firmware as well. When it occurs, the disk starts returning the IDNF error code and becomes unusable, which ZFS treats as a disk failure and which may result in the loss of the data stored on that disk. If several disks fail, the data in the vdev or the pool may be lost. It is noted that such failures are rare: out of about a thousand FreeNAS Mini systems sold with the problematic disks, the problem has surfaced only once in production.
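Why several simultaneous disk failures can cost a vdev or a whole pool follows from the redundancy levels of the standard ZFS vdev types. The sketch below encodes those levels: a mirror survives as long as one disk remains, and raidz1, raidz2 and raidz3 tolerate one, two and three failed disks respectively. The function names and the tuple representation of a pool are hypothetical, chosen only for illustration.

```python
# Number of failed disks each raidz level can tolerate.
TOLERATED_FAILURES = {"raidz1": 1, "raidz2": 2, "raidz3": 3}


def vdev_survives(kind: str, disks: int, failed: int) -> bool:
    """Return True if a vdev of the given kind still holds its data."""
    if kind == "mirror":
        # A mirror keeps its data while at least one disk is healthy.
        return failed < disks
    if kind not in TOLERATED_FAILURES:
        raise ValueError(f"unsupported vdev kind: {kind}")
    return failed <= TOLERATED_FAILURES[kind]


def pool_survives(vdevs) -> bool:
    """A pool is lost if any one of its vdevs is lost."""
    return all(vdev_survives(kind, disks, failed)
               for kind, disks, failed in vdevs)


# One raidz1 vdev of four disks: a second failure during resilvering is fatal.
print(vdev_survives("raidz1", 4, 1))
print(vdev_survives("raidz1", 4, 2))
```

This is why a drive that drops out under resilvering load is dangerous: resilvering is exactly the moment when the vdev has already used up part of its redundancy.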