Proxmox ZFS Disk Replacement and Drive Expansion

December 30, 2021 by Aaron Weiss

The Proxmox VE 6.3 web interface doesn’t allow you to replace a disk in a ZFS pool. Instead, you’ll need to do it through the command line. I wrote this guide as I needed to perform this replacement myself, and it should work on any ZFS system, not just Proxmox.

What’s Being Replaced

I’ve had two 1TB drives that were both purchased well over a decade ago, back when drives that size were about $100 retail. So, very old. Both drives have been used as one of the backup pools for VMs on my Proxmox server, configured as a ZFS mirror, meaning one drive can fail and the data will still be safe. Which is exactly what happened here.

Proxmox had flagged one of the drives as having some SMART errors and marked the pool as degraded. Given the age of these drives, I knew it was time to replace them both.

While I had issues with White Label drives when I first built my TrueNAS server a few years ago, I chose to try them again. The same Amazon seller, GoHardDrive, had 2 TB NAS HDDs for $45.

Adding the New Drives

Luckily, the flagged drive didn’t fail entirely, so I’ll be offlining the disk first and then replacing the devices. This will be a lengthy process because I can only replace one disk at a time: the pool has to resilver – that is, copy the data – to each new disk.

I thought about just creating another pool with the new disks and removing the old one. That’s the smarter thing to do, it really is. However, I needed to challenge myself.

I had two empty SATA ports, which makes this process simpler: I only have to shut down the server once to add the new drives, and then once more after I remove the old drives.

First, you need to find the device names and device IDs. I’m using lsblk and grepping for “sd” to display only SATA devices.

lsblk | grep sd
sda 8:0 0 223.6G 0 disk
├─sda1 8:1 0 223.6G 0 part
└─sda9 8:9 0 8M 0 part
sdb 8:16 0 223.6G 0 disk
├─sdb1 8:17 0 223.6G 0 part
└─sdb9 8:25 0 8M 0 part
sdc 8:32 0 1.8T 0 disk
├─sdc1 8:33 0 1.8T 0 part
└─sdc9 8:41 0 8M 0 part
sdd 8:48 0 931.5G 0 disk
├─sdd1 8:49 0 931.5G 0 part
└─sdd9 8:57 0 8M 0 part
sde 8:64 0 1.8T 0 disk
sdf 8:80 0 931.5G 0 disk
├─sdf1 8:81 0 931.5G 0 part
└─sdf9 8:89 0 8M 0 part
sdg 8:96 0 111.8G 0 disk
├─sdg1 8:97 0 1007K 0 part
├─sdg2 8:98 0 512M 0 part
└─sdg3 8:99 0 111.3G 0 part
sdh 8:112 0 111.8G 0 disk
├─sdh1 8:113 0 1007K 0 part
├─sdh2 8:114 0 512M 0 part
└─sdh3 8:115 0 111.3G 0 part

I know that I’m replacing 1TB drives with 2TB drives. Therefore, the old disks are sdd and sdf, and the new disks are sdc and sde.

Now I need to find their device IDs. I used the following command for each drive, changing the drive letter after sd as needed.

root@johnny5:~# ls -la /dev/disk/by-id | grep sdc
lrwxrwxrwx 1 root root 9 Dec 29 17:16 ata-WL2000GSA6454_WD-WMAY02272888 -> ../../sdc
lrwxrwxrwx 1 root root 10 Dec 29 17:16 ata-WL2000GSA6454_WD-WMAY02272888-part1 -> ../../sdc1
lrwxrwxrwx 1 root root 10 Dec 29 17:16 ata-WL2000GSA6454_WD-WMAY02272888-part9 -> ../../sdc9
lrwxrwxrwx 1 root root 9 Dec 29 17:16 wwn-0x50014ee6abcf8e35 -> ../../sdc
lrwxrwxrwx 1 root root 10 Dec 29 17:16 wwn-0x50014ee6abcf8e35-part1 -> ../../sdc1
lrwxrwxrwx 1 root root 10 Dec 29 17:16 wwn-0x50014ee6abcf8e35-part9 -> ../../sdc9

Now I can see that sdc’s device ID is wwn-0x50014ee6abcf8e35. I ran the same command for the other three drives:

2 TB  sdc  wwn-0x50014ee6abcf8e35
2 TB  sde  wwn-0x50014ee05808492d
1 TB  sdd  wwn-0x50014ee2ad4b90e8
1 TB  sdf  wwn-0x5000c500116662ed
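
Before taking anything offline, it doesn’t hurt to confirm which of these device IDs are actually members of the pool; zpool status lists the current members:

zpool status hdd_pool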

Now that I have this information, I can perform the following:

zpool offline hdd_pool wwn-0x5000c500116662ed

This removes the old device from operation, but it is still a member of the pool. Next, I replace the old drive with the new drive:

zpool replace hdd_pool wwn-0x5000c500116662ed wwn-0x50014ee6abcf8e35

hdd_pool is the name of this pool because it’s the pool on this server that uses HDDs. The replace command immediately starts the resilver process from the remaining disk onto the new drive. I can watch the progress using zpool status hdd_pool:

root@johnny5:~# zpool status hdd_pool
  pool: hdd_pool
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Wed Dec 29 19:15:59 2021
        471G scanned at 4.58G/s, 7.08G issued at 70.4M/s, 471G total
        7.08G resilvered, 1.50% done, 0 days 01:52:31 to go
config:
        NAME                          STATE     READ WRITE CKSUM
        hdd_pool                      DEGRADED     0     0     0
          mirror-0                    DEGRADED     0     0     0
            replacing-0               DEGRADED     0     0     0
              wwn-0x5000c500116662ed  OFFLINE      0     0     1
              wwn-0x50014ee6abcf8e35  ONLINE       0     0     0  (resilvering)
            wwn-0x50014ee2ad4b90e8    ONLINE       0     0     0
errors: No known data errors

Alternatively, you can watch the progress of the resilvering within the Proxmox UI: go to Datacenter > Node > Disks > ZFS, then double-click the pool in question.

Proxmox ZFS Pool Detail
Example of Proxmox’s ZFS Pool Details and the resilvering process.
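
If you would rather keep watching from the terminal instead, the standard watch utility can re-run the status command every few seconds (the 10-second interval below is arbitrary):

watch -n 10 zpool status hdd_pool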

It took about 2.5 hours to resilver each drive. Of course, now I have to do this whole thing again with the other two drives:

zpool offline hdd_pool wwn-0x50014ee2ad4b90e8; zpool replace hdd_pool wwn-0x50014ee2ad4b90e8 wwn-0x50014ee05808492d

Another 2 hours later, and another zpool status hdd_pool displays:

root@johnny5:~# zpool status hdd_pool 
  pool: hdd_pool 
 state: ONLINE 
  scan: resilvered 472G in 0 days 01:13:55 with 0 errors on Wed Dec 29 22:46:55 2021 
config:
        NAME                        STATE     READ WRITE CKSUM
        hdd_pool                    ONLINE       0     0     0
          mirror-0                  ONLINE       0     0     0
            wwn-0x50014ee6abcf8e35  ONLINE       0     0     0
            wwn-0x50014ee05808492d  ONLINE       0     0     0
errors: No known data errors

Now, our two new disks have fully resilvered and our pool is working as expected.
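
As an optional sanity check (not strictly required after a clean resilver), a scrub will re-read and verify every block on the new disks:

zpool scrub hdd_pool
zpool status hdd_pool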

Expand Disks to Use All Available Disk Space

Next, we need to expand the disks to use all the available disk space. Otherwise, the pool will only continue to use the 1 TB of space that the original two drives offered:

Proxmox ZFS Disk Pre-expanded
Example of the pool working, but only showing the original usable disk space in the pool
root@johnny5:~# zpool list
NAME       SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP    HEALTH  ALTROOT
hdd_pool   928G   471G   472G        -         -     1%    25%  1.00x    ONLINE  -
rpool      111G  17.0G  94.0G        -         -     6%    15%  1.00x    ONLINE  -
sdd_pool   222G  70.9G   151G        -         -    36%    31%  1.00x    ONLINE  -
vm_pool   3.62T   589G  3.05T        -         -    22%    15%  1.00x    ONLINE  -

To do this, I ran the following:

zpool online -e hdd_pool wwn-0x50014ee6abcf8e35
zpool online -e hdd_pool wwn-0x50014ee05808492d

These commands expand each of the new devices within hdd_pool to use all of its available space.

root@johnny5:~# zpool list
NAME       SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP    HEALTH  ALTROOT
hdd_pool  1.81T   472G  1.35T        -         -     1%    25%  1.00x    ONLINE  -
rpool      111G  17.0G  94.0G        -         -     6%    15%  1.00x    ONLINE  -
sdd_pool   222G  70.9G   151G        -         -    36%    31%  1.00x    ONLINE  -
vm_pool   3.62T   589G  3.05T        -         -    22%    15%  1.00x    ONLINE  -

Now you can see that hdd_pool has expanded to use both disks fully:

Proxmox Pool expanded
Pool now uses all the available disk space of the new disks.
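
As an aside, instead of running zpool online -e manually, the pool’s autoexpand property can be turned on before the replacements; ZFS should then grow onto the larger disks automatically once the final resilver finishes:

zpool set autoexpand=on hdd_pool
zpool get autoexpand hdd_pool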

Run SMART Tests

Honestly, this should have been the first step, but I forgot to do it. This step runs a long SMART test:

smartctl -t long /dev/sdc; smartctl -t long /dev/sde

Please wait 295 minutes for test to complete.

This runs the SMART long test on sdc and sde, the two new disks. While the tests were running, I decided to run a quick report on both of them:

smartctl -a /dev/sdc; smartctl -a /dev/sde

=== START OF INFORMATION SECTION ===
Device Model: WL2000GSA6454
Serial Number: WD-WMAY02097019
LU WWN Device Id: 5 0014ee 05808492d
Firmware Version: 00.0NS03
User Capacity: 2,000,398,934,016 bytes [2.00 TB]
Sector Size: 512 bytes logical/physical
Rotation Rate: 7200 rpm
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: ATA8-ACS (minor revision not indicated)
SATA Version is: SATA 2.6, 3.0 Gb/s
Local Time is: Wed Dec 29 23:22:50 2021 CST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

The Rotation Rate and SATA Version values show that I received the wrong disks: I was supposed to receive two 2TB SATA III drives at 5400 RPM, yet these report 7200 rpm and SATA 2.6 at 3.0 Gb/s. This is the second time that I received drives that were incorrectly described by this White Label drive manufacturer. I sent the Amazon seller a question about the discrepancy, so I’ll wait to hear back.
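
Separate from the labeling issue, the long tests were still running at this point. Once they finish, the results end up in each drive’s self-test log, which smartctl can print:

smartctl -l selftest /dev/sdc; smartctl -l selftest /dev/sde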

Conclusion

Regardless of the incorrectly described product, these two drives are working, quiet, and have increased my backup capacity. This pool is strictly used as a destination for VM backups; the same backups are also sent to my TrueNAS device, which is my primary backup destination.

I hope this article was helpful in replacing the disks in your own ZFS pool.
