Testing New Drives
To set up my network attached storage (NAS), I recently ordered two 4 TB
WD Red drives. After some research (here and here), I came up with the
protocol below. Note: run all commands as root and replace /dev/disk
with the appropriate device name for your setup:
- read out S.M.A.R.T. attributes:
smartctl -a /dev/disk > baseline
- perform a conveyance test to check for damage during transport:
smartctl -t conveyance /dev/disk
and compare to baseline
- perform a short test:
smartctl -t short /dev/disk
and compare to baseline
- run badblocks on the complete disk (the -w write-mode test overwrites all data on it):
badblocks -wsv -b 4096 -t random -o badblocks.txt /dev/disk
monitor temperature and compare to baseline. This took around 7 hours for my drives.
- another short test:
smartctl -t short /dev/disk
to see if errors came up
- finally, perform a long test:
smartctl -t long /dev/disk
and check attributes one last time. You can see the estimated runtime of the test with
smartctl -c /dev/disk
After all these tests, if none of the critical attributes were affected, we can put the drive into production. Otherwise, send it back. In my case, both drives passed the tests.
Compare to baseline
After reading the S.M.A.R.T. attributes of the fresh disk in the first
step above and saving them to a file called baseline, we can compare the
attributes to this baseline after each test with the following commands:
smartctl -a /dev/disk > attributes
diff baseline attributes
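The full report from smartctl -a also contains lines that change on every run (the local time, for example), so the diff is rarely empty. As a rough sketch, the attribute table can be reduced to name and raw value before diffing. The function name attribute_table and the file names baseline.raw and attributes.raw are hypothetical, and the sketch assumes the usual ten-column attribute table with the FLAG value in the third column:

```shell
# Reduce `smartctl -a` output (read from stdin) to "ATTRIBUTE_NAME RAW_VALUE" lines.
# Assumption: attribute rows start with a numeric ID and carry a 0x... FLAG in column 3.
# Only the first word of RAW_VALUE is kept (some drives append extra text to it).
attribute_table() {
    awk '$1 ~ /^[0-9]+$/ && $3 ~ /^0x[0-9a-fA-F]+$/ && NF >= 10 { print $2, $10 }'
}

# Possible usage (hypothetical file names):
#   attribute_table < baseline > baseline.raw
#   smartctl -a /dev/disk | attribute_table > attributes.raw
#   diff baseline.raw attributes.raw
```

Attributes such as Power_On_Hours will still differ between runs, which is expected.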
Check for any differences, especially in the following attributes:
- Raw_Read_Error_Rate
- Reallocated_Sector_Ct
- Seek_Error_Rate
- Reallocated_Event_Count
- Current_Pending_Sector
- Offline_Uncorrectable
- UDMA_CRC_Error_Count
If the RAW_VALUE of any of these attributes is above 0, return the disk.
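The check against the attributes listed above could be automated along these lines. This is only a sketch: the function name check_critical_attributes is made up for illustration, and it assumes RAW_VALUE is the tenth whitespace-separated column of the attribute table, as in typical smartctl output:

```shell
# Read `smartctl -A` output on stdin and print every watched attribute
# whose RAW_VALUE (assumed to be column 10) is above 0.
# Returns 1 if at least one attribute was flagged, 0 otherwise.
check_critical_attributes() {
    awk '
        BEGIN {
            bad = 0
            watched["Raw_Read_Error_Rate"] = 1
            watched["Reallocated_Sector_Ct"] = 1
            watched["Seek_Error_Rate"] = 1
            watched["Reallocated_Event_Count"] = 1
            watched["Current_Pending_Sector"] = 1
            watched["Offline_Uncorrectable"] = 1
            watched["UDMA_CRC_Error_Count"] = 1
        }
        ($2 in watched) && ($10 + 0 > 0) {
            print $2 ": " $10
            bad = 1
        }
        END { exit bad }
    '
}

# Possible usage:
#   smartctl -A /dev/disk | check_critical_attributes || echo "return the disk"
```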
Monitor temperature
Some of the tests, especially badblocks, stress the drive and increase
its running temperature. It is important to have enough cooling and to
monitor disk temperature regularly (e.g. every 4 hours) using the
following command:
smartctl -l scttemp /dev/disk
The temperature should never rise above the maximum recommended temperature indicated in the output of that command (65 degrees Celsius for my drives). If it comes within 5°C of that maximum, immediately abort the test and provide the drive with better cooling. Then resume testing.
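The 5°C rule could be checked by a small helper like the one below. It is a sketch under assumptions: the name check_temperature is hypothetical, and it expects the scttemp output to contain lines beginning with "Current Temperature:" and "Min/Max recommended Temperature:", which may differ between drives and smartctl versions:

```shell
# Read `smartctl -l scttemp` output on stdin.
# Prints "<current> <max recommended>" and returns 1 if the current
# temperature is within the margin (first argument, default 5) of the maximum.
# Returns 2 if the expected lines were not found (assumed output format).
check_temperature() {
    awk -v margin="${1:-5}" '
        /^Current Temperature:/ { current = $3 }
        /^Min\/Max recommended Temperature:/ { split($4, range, "/"); max = range[2] }
        END {
            if (current == "" || max == "") { print "temperature not found"; exit 2 }
            print current, max
            exit ((current + 0 >= max - margin) ? 1 : 0)
        }
    '
}

# Possible usage:
#   smartctl -l scttemp /dev/disk | check_temperature || echo "too hot or unreadable: check the drive"
```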